Abstract. In these notes, we consider the problem of finding the logarithm or the square
root of a real matrix. It is known that for every real $n \times n$ matrix, $A$, if no real eigenvalue
of $A$ is negative or zero, then $A$ has a real logarithm, that is, there is a real matrix, $X$, such
that $e^X = A$. Furthermore, if the eigenvalues, $\xi$, of $X$ satisfy the property $-\pi < \Im(\xi) < \pi$,
then $X$ is unique. It is also known that under the same condition every real $n \times n$ matrix, $A$,
has a real square root, that is, there is a real matrix, $X$, such that $X^2 = A$. Moreover, if the
eigenvalues, $\rho e^{i\theta}$, of $X$ satisfy the condition $-\frac{\pi}{2} < \theta < \frac{\pi}{2}$, then $X$ is unique. These theorems
are the theoretical basis for various numerical methods for exponentiating a matrix or for
computing its logarithm using a method known as scaling and squaring (resp. inverse scaling
and squaring). Such methods play an important role in the log-Euclidean framework due to
Arsigny, Fillard, Pennec and Ayache and its applications to medical imaging. Actually, there
is a necessary and sufficient condition for a real matrix to have a real logarithm (or a real
square root) but it is fairly subtle as it involves the parity of the number of Jordan blocks
associated with negative eigenvalues. As far as I know, with the exception of Higham's recent
book [17], proofs of these results are scattered in the literature and it is not easy to locate
them. Moreover, Higham's excellent book assumes a certain level of background in linear
algebra that readers interested in the topics of this paper may not possess, so we feel that
a more elementary presentation might be a valuable supplement to Higham [17]. In these
notes, I present a unified exposition of these results and give more direct proofs of some of
them using the Real Jordan Form.
1 Jordan Decomposition and Jordan Normal Form



The proofs of the theorems stated in the abstract make heavy use of the Jordan normal
form of a matrix and its cousin, the Jordan decomposition of a matrix into its semisimple
part and its nilpotent part. The purpose of this section is to review these concepts rather
thoroughly to make sure that the reader has the background necessary to understand the
proofs in Section 2 and Section 3. We pay particular attention to the Real Jordan Form
(Horn and Johnson [19], Chapter 3, Section 4, Theorem 3.4.5, Hirsch and Smale [18], Chapter
6) which, although familiar to experts in linear algebra, is typically missing from "standard"
algebra books. We give a complete proof of the Real Jordan Form as such a proof does not
seem to be easily found (even Horn and Johnson [19] only give a sketch of the proof, but it
is covered in Hirsch and Smale [18], Chapter 6).

Let $V$ be a finite dimensional real vector space. Recall that we can form the
complexification, $V_{\mathbb{C}}$, of $V$. The space $V_{\mathbb{C}}$ is the complex vector space, $V \times V$, with the addition
operation given by

\[
(u_1, v_1) + (u_2, v_2) = (u_1 + u_2, v_1 + v_2),
\]

and the scalar multiplication given by

\[
(\lambda + i\mu) \cdot (u, v) = (\lambda u - \mu v, \mu u + \lambda v) \qquad (\lambda, \mu \in \mathbb{R}).
\]

Obviously

\[
(0, v) = i \cdot (v, 0),
\]

so every vector, $(u, v) \in V_{\mathbb{C}}$, can be written uniquely as

\[
(u, v) = (u, 0) + i \cdot (v, 0).
\]

The map from $V$ to $V_{\mathbb{C}}$ given by $u \mapsto (u, 0)$ is obviously an injection and for notational
convenience, we write $(u, 0)$ as $u$, we suppress the symbol $\cdot$ ("dot") for scalar multiplication
and we write

\[
(u, v) = u + iv, \quad \text{with } u, v \in V.
\]

Observe that if $(e_1, \ldots, e_n)$ is a basis of $V$, then it is also a basis of $V_{\mathbb{C}}$.
Every linear map, $f \colon V \to V$, yields a linear map, $f_{\mathbb{C}} \colon V_{\mathbb{C}} \to V_{\mathbb{C}}$, with

\[
f_{\mathbb{C}}(u + iv) = f(u) + i f(v), \quad \text{for all } u, v \in V.
\]
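To make the construction concrete, here is a minimal Python sketch (not part of the original notes) that models a vector of the complexification of $V = \mathbb{R}^2$ as a pair of real vectors and checks the identities stated above. The names `c_add`, `c_scale`, `complexify` and `rot90` are hypothetical helpers introduced only for this illustration.

```python
from typing import Callable, List, Tuple

Vec = List[float]
CVec = Tuple[Vec, Vec]  # the pair (u, v), read as u + i v


def c_add(x: CVec, y: CVec) -> CVec:
    """(u1, v1) + (u2, v2) = (u1 + u2, v1 + v2)."""
    (u1, v1), (u2, v2) = x, y
    return ([a + b for a, b in zip(u1, u2)], [a + b for a, b in zip(v1, v2)])


def c_scale(z: complex, x: CVec) -> CVec:
    """(lam + i mu) . (u, v) = (lam u - mu v, mu u + lam v)."""
    lam, mu = z.real, z.imag
    u, v = x
    return ([lam * a - mu * b for a, b in zip(u, v)],
            [mu * a + lam * b for a, b in zip(u, v)])


def complexify(f: Callable[[Vec], Vec]) -> Callable[[CVec], CVec]:
    """Extend a real-linear map f to f_C(u + i v) = f(u) + i f(v)."""
    return lambda x: (f(x[0]), f(x[1]))


def rot90(w: Vec) -> Vec:
    """Sample real-linear map on R^2: rotation by 90 degrees."""
    return [-w[1], w[0]]


u, v = [1.0, 2.0], [3.0, 4.0]
zero = [0.0, 0.0]

# (0, v) = i . (v, 0)
assert c_scale(1j, (v, zero)) == (zero, v)
# (u, v) = (u, 0) + i . (v, 0)
assert c_add((u, zero), c_scale(1j, (v, zero))) == (u, v)
# f_C is complex-linear: it commutes with multiplication by a complex scalar.
f_c = complexify(rot90)
z = 2 + 3j
assert f_c(c_scale(z, (u, v))) == c_scale(z, f_c((u, v)))
```

The sample values are small integers stored as floats, so the equality checks are exact here; with general floating-point data one would compare up to a tolerance instead.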

Definition 1.1 A linear map, $f \colon V \to V$, is semisimple iff $f_{\mathbb{C}}$ can be diagonalized. In terms
of matrices, a real matrix, $A$, is semisimple iff there are some matrices $D$ and $P$ with entries
in $\mathbb{C}$, with $P$ invertible and $D$ a diagonal matrix, so that $A = PDP^{-1}$. We say that $f$ is
nilpotent iff $f^r = 0$ for some positive integer, $r$, and a matrix, $A$, is nilpotent iff $A^r = 0$ for
some positive integer, $r$. We say that $f$ is unipotent iff $f - \mathrm{id}$ is nilpotent and a matrix $A$ is
unipotent iff $A - I$ is nilpotent.
If $A$ is unipotent, then $A = I + N$ where $N$ is nilpotent. If $r$ is the smallest integer so
that $N^r = 0$ (the index of nilpotency of $N$), then it is easy to check that

\[
I - N + N^2 + \cdots + (-1)^{r-1} N^{r-1}
\]

is the inverse of $A = I + N$.

For example, rotation matrices are semisimple, although in general they can't be
diagonalized over $\mathbb{R}$, since their eigenvalues are complex numbers of the form $e^{i\theta}$. Every
upper-triangular matrix where all the diagonal entries are zero is nilpotent.
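The formula for the inverse of a unipotent matrix can be checked on a small example. The following Python sketch is only an illustration under the assumption that matrices are stored as lists of integer rows; `mat_mul`, `identity` and `unipotent_inverse` are hypothetical helper names.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]


def unipotent_inverse(n_mat):
    """Inverse of I + N for nilpotent N: I - N + N^2 - ... + (-1)^(r-1) N^(r-1)."""
    n = len(n_mat)
    zero = [[0] * n for _ in range(n)]
    result = identity(n)
    power = identity(n)
    sign = 1
    for _ in range(n):  # the index of nilpotency of an n x n matrix is at most n
        power = mat_mul(power, n_mat)
        sign = -sign
        if power == zero:
            return result
        result = [[result[i][j] + sign * power[i][j] for j in range(n)]
                  for i in range(n)]
    raise ValueError("N is not nilpotent")


# A strictly upper-triangular matrix is nilpotent (here N^3 = 0).
N = [[0, 1, 2],
     [0, 0, 3],
     [0, 0, 0]]
A = [[int(i == j) + N[i][j] for j in range(3)] for i in range(3)]  # A = I + N
A_inv = unipotent_inverse(N)
assert mat_mul(A, A_inv) == identity(3)
assert mat_mul(A_inv, A) == identity(3)
```

The identity behind the check is $(I + N)(I - N + \cdots + (-1)^{r-1}N^{r-1}) = I + (-1)^{r-1}N^r = I$, a telescoping sum.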

Definition 1.2 If $f \colon V \to V$ is a linear map with $V$ a finite dimensional vector space over $\mathbb{R}$ or $\mathbb{C}$, a
Jordan decomposition of $f$ is a pair of linear maps, $f_S, f_N \colon V \to V$, with $f_S$ semisimple and
$f_N$ nilpotent, such that

\[
f = f_S + f_N \quad \text{and} \quad f_S \circ f_N = f_N \circ f_S.
\]

The theorem below is a very useful technical tool for dealing with the exponential map.
It can be proved from the so-called primary decomposition theorem or from the Jordan form
(see Hoffman and Kunze [21], Chapter 6, Section 4 or Bourbaki [7], Chapter VII, §5).

Theorem 1.3 If $V$ is a finite dimensional vector space over $\mathbb{C}$, then every linear map,
$f \colon V \to V$, has a unique Jordan decomposition, $f = f_S + f_N$. Furthermore, $f_S$ and $f_N$ can
be expressed as polynomials in $f$ with no constant term.

Remark: In fact, Theorem 1.3 holds for any finite dimensional vector space over a perfect
field, $K$ (this means that either $K$ has characteristic zero or that $K^p = K$, where $K^p =
\{a^p \mid a \in K\}$ and where $p \geq 2$ is the characteristic of the field $K$). The proof of this
stronger version of Theorem 1.3 is more subtle and involves some elementary Galois theory
(see Hoffman and Kunze [21], Chapter 7, Section 4 or, for maximum generality, Bourbaki
[7], Chapter VII, §5).

We will need Theorem 1.3 in the case where $V$ is a real vector space. In fact we need a
slightly refined version of Theorem 1.3 for $K = \mathbb{R}$ known as the Real Jordan form. First, let
us review Jordan matrices and real Jordan matrices.

Definition 1.4 A (complex) Jordan block is an $r \times r$ matrix, $J_r(\lambda)$, of the form

\[
J_r(\lambda) =
\begin{pmatrix}
\lambda & 1 & 0 & \cdots & 0 \\
0 & \lambda & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \ddots & \vdots \\
0 & 0 & 0 & \ddots & 1 \\
0 & 0 & 0 & \cdots & \lambda
\end{pmatrix},
\]

where $\lambda \in \mathbb{C}$, with $J_1(\lambda) = (\lambda)$ if $r = 1$. A real Jordan block is either
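(1) a Jordan block as above with $\lambda \in \mathbb{R}$, or

(2) a real $2r \times 2r$ matrix, $J_{2r}(\lambda, \mu)$, of the form

\[
J_{2r}(\lambda, \mu) =
\begin{pmatrix}
L(\lambda, \mu) & I & 0 & \cdots & 0 \\
0 & L(\lambda, \mu) & I & \cdots & 0 \\
\vdots & \vdots & \ddots & \ddots & \vdots \\
0 & 0 & 0 & \ddots & I \\
0 & 0 & 0 & \cdots & L(\lambda, \mu)
\end{pmatrix},
\]

where $L(\lambda, \mu)$ is a $2 \times 2$ matrix of the form

\[
L(\lambda, \mu) =
\begin{pmatrix}
\lambda & -\mu \\
\mu & \lambda
\end{pmatrix}
\]

with $\lambda, \mu \in \mathbb{R}$, $\mu \neq 0$, with $I$ the $2 \times 2$ identity matrix and with $J_2(\lambda, \mu) = L(\lambda, \mu)$
when $r = 1$.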
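As an illustration of Definition 1.4 (a sketch added for concreteness, not taken from the original notes), the two kinds of blocks can be built explicitly. The function names `jordan_block` and `real_jordan_block` are hypothetical.

```python
def jordan_block(r, lam):
    """Return the r x r Jordan block J_r(lam) as a list of rows."""
    return [[lam if j == i else (1 if j == i + 1 else 0) for j in range(r)]
            for i in range(r)]


def real_jordan_block(r, lam, mu):
    """Return the 2r x 2r real Jordan block J_{2r}(lam, mu) as a list of rows.

    Each diagonal 2 x 2 block is L(lam, mu) = [[lam, -mu], [mu, lam]] and each
    block just above the diagonal is the 2 x 2 identity. Definition 1.4 requires
    mu != 0; the function does not enforce this.
    """
    size = 2 * r
    block = [[0] * size for _ in range(size)]
    for k in range(r):
        a = 2 * k
        block[a][a], block[a][a + 1] = lam, -mu
        block[a + 1][a], block[a + 1][a + 1] = mu, lam
        if k + 1 < r:
            block[a][a + 2] = 1
            block[a + 1][a + 3] = 1
    return block


for row in real_jordan_block(2, 1, 2):
    print(row)
```

For $r = 2$, $\lambda = 1$, $\mu = 2$ this prints the rows of the $4 \times 4$ matrix with two copies of $L(1, 2)$ on the diagonal and one identity block above the diagonal.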

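A (complex) Jordan matrix, $J$, is an $n \times n$ block diagonal matrix of the form

\[
J =
\begin{pmatrix}
J_{r_1}(\lambda_1) & \cdots & 0 \\
\vdots & \ddots & \vdots \\
0 & \cdots & J_{r_m}(\lambda_m)
\end{pmatrix},
\]

where each $J_{r_k}(\lambda_k)$ is a (complex) Jordan block associated with some $\lambda_k \in \mathbb{C}$ and with
$r_1 + \cdots + r_m = n$. A real Jordan matrix, $J$, is an $n \times n$ block diagonal matrix of the form

\[
J =
\begin{pmatrix}
J_{s_1}(\alpha_1) & \cdots & 0 \\
\vdots & \ddots & \vdots \\
0 & \cdots & J_{s_m}(\alpha_m)
\end{pmatrix},
\]

where each $J_{s_k}(\alpha_k)$ is a real Jordan block either associated with some $\alpha_k = \lambda_k \in \mathbb{R}$ as in (1)
or associated with some $\alpha_k = (\lambda_k, \mu_k) \in \mathbb{R}^2$, with $\mu_k \neq 0$, as in (2), in which case $s_k = 2r_k$.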



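To simplify notation, we often write $J(\lambda)$ for $J_r(\lambda)$ (or $J(\alpha)$ for $J_s(\alpha)$). Here is an
example of a Jordan matrix with four blocks:

\[
J =
\begin{pmatrix}
\lambda & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & \lambda & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & \lambda & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & \lambda & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & \lambda & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & \lambda & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & \mu & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & \mu
\end{pmatrix}.
\]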
In order to prove properties of the exponential of Jordan blocks, we need to understand
the deeper reasons for the existence of the Jordan form. For this, we review the notion of a
minimal polynomial.

Recall that a polynomial, $p(X)$, of degree $n \geq 1$ is a monic polynomial iff the monomial
of highest degree in $p(X)$ is of the form $X^n$ (that is, the coefficient of $X^n$ is equal to 1). As
usual, let $\mathbb{C}[X]$ be the ring of polynomials

\[
p(X) = a_0 X^n + a_1 X^{n-1} + \cdots + a_{n-1} X + a_n,
\]

with complex coefficients, $a_j \in \mathbb{C}$, and let $\mathbb{R}[X]$ be the ring of polynomials with real
coefficients, $a_j \in \mathbb{R}$. If $V$ is a finite dimensional complex vector space and $f \colon V \to V$ is a given
linear map, every polynomial

\[
p(X) = a_0 X^n + a_1 X^{n-1} + \cdots + a_{n-1} X + a_n,
\]

yields the linear map denoted $p(f)$, where

\[
p(f)(v) = a_0 f^n(v) + a_1 f^{n-1}(v) + \cdots + a_{n-1} f(v) + a_n v, \quad \text{for every } v \in V,
\]

and where $f^k = f \circ \cdots \circ f$ is the composition of $f$ with itself $k$ times. We also write

\[
p(f) = a_0 f^n + a_1 f^{n-1} + \cdots + a_{n-1} f + a_n \mathrm{id}.
\]

Do not confuse $p(X)$ and $p(f)$. The expression $p(X)$ denotes a polynomial in the
"indeterminate" $X$, whereas $p(f)$ denotes a linear map from $V$ to $V$.

For example, if $p(X)$ is the polynomial

\[
p(X) = X^3 - 2X^2 + 3X - 1,
\]

and if $A$ is any $n \times n$ matrix, then $p(A)$ is the $n \times n$ matrix

\[
p(A) = A^3 - 2A^2 + 3A - I
\]

obtained by formally substituting the matrix $A$ for the variable $X$.
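The substitution of a matrix for the variable $X$ can be carried out with Horner's rule. The Python sketch below is an illustration only; the helper names `mat_mul` and `poly_at_matrix` and the sample matrix are assumptions made for this example.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def poly_at_matrix(coeffs, a):
    """Evaluate a0 X^n + a1 X^(n-1) + ... + an at the matrix a.

    coeffs = [a0, a1, ..., an]; Horner's rule: result <- result * a + c * I.
    """
    n = len(a)
    result = [[0] * n for _ in range(n)]
    for c in coeffs:
        result = mat_mul(result, a)
        for i in range(n):
            result[i][i] += c  # the constant c enters as c * I
    return result


A = [[1, 1],
     [0, 1]]
p_of_A = poly_at_matrix([1, -2, 3, -1], A)  # p(X) = X^3 - 2X^2 + 3X - 1

# Cross-check against the direct formula A^3 - 2A^2 + 3A - I.
A2 = mat_mul(A, A)
A3 = mat_mul(A2, A)
direct = [[A3[i][j] - 2 * A2[i][j] + 3 * A[i][j] - int(i == j) for j in range(2)]
          for i in range(2)]
assert p_of_A == direct
```

Note that the constant term $-1$ of $p(X)$ becomes $-I$ in $p(A)$, which is why the loop adds each coefficient to the diagonal only.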
Thus, we can define a "scalar multiplication", $\cdot \colon \mathbb{C}[X] \times V \to V$, by

\[
p(X) \cdot v = p(f)(v), \quad v \in V.
\]

We immediately check that

\[
\begin{aligned}
p(X) \cdot (u + v) &= p(X) \cdot u + p(X) \cdot v \\
(p(X) + q(X)) \cdot u &= p(X) \cdot u + q(X) \cdot u \\
(p(X) q(X)) \cdot u &= p(X) \cdot (q(X) \cdot u) \\
1 \cdot u &= u,
\end{aligned}
\]

for all $u, v \in V$ and all $p(X), q(X) \in \mathbb{C}[X]$, where 1 denotes the polynomial of degree 0 with
constant term 1.

It follows that the scalar multiplication, $\cdot \colon \mathbb{C}[X] \times V \to V$, makes $V$ into a $\mathbb{C}[X]$-module
that we will denote by $V_f$. Furthermore, as $\mathbb{C}$ is a subring of $\mathbb{C}[X]$ and as $V$ is
finite-dimensional, $V$ is finitely generated over $\mathbb{C}$ and so $V_f$ is finitely generated as a module over
$\mathbb{C}[X]$.

Now, because $V$ is finite dimensional, we claim that there is some nonzero polynomial, $q(X)$, that
annihilates $V_f$, that is, so that

\[
q(f)(v) = 0, \quad \text{for all } v \in V.
\]

To prove this fact, observe that if $V$ has dimension $n$, then the set of linear maps from $V$ to
$V$ has dimension $n^2$. Therefore any $n^2 + 1$ linear maps must be linearly dependent, so

\[
\mathrm{id}, f, f^2, \ldots, f^{n^2}
\]

are linearly dependent linear maps and there is a nonzero polynomial, $q(X)$, of degree at
most $n^2$ so that $q(f)(v) = 0$ for all $v \in V$. (In fact, by the Cayley-Hamilton Theorem,
the characteristic polynomial, $q_f(X) = \det(X \mathrm{id} - f)$, of $f$ annihilates $V$, so there is some
annihilating polynomial of degree at most $n$.) By abuse of language, if $q(X)$ annihilates $V_f$,
we also say that $q(X)$ annihilates $V$.

Now, the set of annihilating polynomials of $V$ forms a principal ideal in $\mathbb{C}[X]$, which
means that there is a unique monic polynomial of minimal degree, $p_f$, annihilating $V$ and
every other polynomial annihilating $V$ is a multiple of $p_f$. We call this minimal monic
polynomial annihilating $V$ the minimal polynomial of $f$.
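The linear dependence argument above suggests a direct, if naive, way of computing a minimal polynomial: find the first power $A^k$ that is a linear combination of $I, A, \ldots, A^{k-1}$. The Python sketch below implements this idea with exact rational arithmetic. It is an illustration under our own assumptions (matrices as lists of rows with integer or rational entries, hypothetical helper names), not a method taken from the notes and not a numerically robust algorithm.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def solve_combination(vectors, target):
    """Return c with sum_j c[j] * vectors[j] == target, or None if impossible.

    Assumes the given vectors are linearly independent.
    """
    k, dim = len(vectors), len(target)
    # Augmented system: one row per coordinate, one column per vector.
    rows = [[vectors[j][i] for j in range(k)] + [target[i]] for i in range(dim)]
    for col in range(k):
        piv = next(i for i in range(col, dim) if rows[i][col] != 0)
        rows[col], rows[piv] = rows[piv], rows[col]
        pivot_value = rows[col][col]
        rows[col] = [x / pivot_value for x in rows[col]]
        for i in range(dim):
            if i != col and rows[i][col] != 0:
                factor = rows[i][col]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[col])]
    if any(row[k] != 0 for row in rows[k:]):
        return None  # a leftover row reads 0 = nonzero: no solution
    return [rows[j][k] for j in range(k)]


def minimal_polynomial(a):
    """Monic minimal polynomial of a, as the coefficient list [1, c_1, ..., c_k]
    of X^k + c_1 X^(k-1) + ... + c_k."""
    n = len(a)
    a = [[Fraction(x) for x in row] for row in a]
    powers = [[[Fraction(int(i == j)) for j in range(n)] for i in range(n)]]
    while True:
        nxt = mat_mul(powers[-1], a)
        flat = [[x for row in p for x in row] for p in powers]
        combo = solve_combination(flat, [x for row in nxt for x in row])
        if combo is not None:
            # a^k = sum_j combo[j] a^j, so X^k - sum_j combo[j] X^j annihilates a.
            return [Fraction(1)] + [-c for c in reversed(combo)]
        powers.append(nxt)


# The characteristic polynomial of this matrix is (X - 2)^3,
# but its minimal polynomial is (X - 2)^2 = X^2 - 4X + 4.
A = [[2, 1, 0],
     [0, 2, 0],
     [0, 0, 2]]
assert minimal_polynomial(A) == [1, -4, 4]
```

Since the loop stops at the first dependent power, the powers passed to `solve_combination` are always linearly independent, which is the assumption that helper relies on.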

The fact that $V$ is annihilated by some nonzero polynomial in $\mathbb{C}[X]$ makes $V_f$ a torsion
$\mathbb{C}[X]$-module. Furthermore, the ring $\mathbb{C}[X]$ is a principal ideal
domain, abbreviated PID (this means that every ideal is generated by a single polynomial
which can be chosen to be monic and of smallest degree). The ring $\mathbb{R}[X]$ is also a PID. In
fact, the ring $k[X]$ is a PID for any field, $k$. But then, we can apply some powerful results
about the structure of finitely generated torsion modules over a PID to $V_f$ and obtain various
decompositions of $V$ into subspaces which yield useful normal forms for $f$, in particular, the
Jordan form.

Let us give one more definition before stating our next important theorem: Say that $V$ is
a cyclic module iff $V$ is generated by a single element as a $\mathbb{C}[X]$-module, which means that
there is some $u \in V$ so that $u, f(u), f^2(u), \ldots, f^k(u), \ldots$, generate $V$.

Theorem 1.5 Let $V$ be a finite-dimensional complex vector space of dimension $n$. For every
linear map, $f \colon V \to V$, there is a direct sum decomposition,

\[
V = V_1 \oplus V_2 \oplus \cdots \oplus V_m,
\]

where each $V_i$ is a cyclic $\mathbb{C}[X]$-module such that the minimal polynomial of the restriction
of $f$ to $V_i$ is of the form $(X - \lambda_i)^{r_i}$. Furthermore, the number, $m$, of subspaces $V_i$ and the
minimal polynomials of the $V_i$ are uniquely determined by $f$ and, for each such polynomial,
$(X - \lambda)^r$, the number, $m_i$, of $V_i$'s that have $(X - \lambda)^r$ as minimal polynomial (that is, if
$\lambda = \lambda_i$ and $r = r_i$) is uniquely determined by $f$.

A proof of Theorem 1.5 can be found in M. Artin [5], Chapter 12, Section 7, Lang [23],
Chapter XIV, Section 2, Dummit and Foote [13], Chapter 12, Section 1 and Section 3, or
D. Serre [22], Chapter 6, Section 3. A very good exposition is also given in Gantmacher
[14], Chapter VII, in particular, see Theorem 8 and Theorem 12. However, in Gantmacher,
elementary divisors are defined in a rather cumbersome manner in terms of ratios of
determinants of certain minors. This makes, at times, the proof unnecessarily hard to follow.

The minimal polynomials, $(X - \lambda_i)^{r_i}$, associated with the $V_i$'s are called the elementary
divisors of $f$. They need not be distinct. To be more precise, if the set of distinct elementary
divisors of $f$ is

\[
\{(X - \lambda_1)^{r_1}, \ldots, (X - \lambda_t)^{r_t}\}
\]

then $(X - \lambda_1)^{r_1}$ appears $m_1 \geq 1$ times, $(X - \lambda_2)^{r_2}$ appears $m_2 \geq 1$ times, ..., $(X - \lambda_t)^{r_t}$
appears $m_t \geq 1$ times, with

\[
m_1 + m_2 + \cdots + m_t = m.
\]

The number, $m_i$, is called the multiplicity of $(X - \lambda_i)^{r_i}$. Furthermore, if $(X - \lambda_i)^{r_i}$ and
$(X - \lambda_j)^{r_j}$ are two distinct elementary divisors, it is possible that $r_i \neq r_j$ yet $\lambda_i = \lambda_j$.

Observe that $f - \lambda_i \mathrm{id}$ is nilpotent on $V_i$ with index of nilpotency $r_i$ (which means that
$(f - \lambda_i \mathrm{id})^{r_i} = 0$ on $V_i$ but $(f - \lambda_i \mathrm{id})^{r_i - 1} \neq 0$ on $V_i$). Also, note that the monomials, $(X - \lambda_i)$,
are the irreducible factors of the minimal polynomial of $f$.

Next, let us take a closer look at the subspaces, Vi. It turns out that we can find a "good" 
basis of Vi so that in this basis, the restriction of / to Vi is a Jordan block. 

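Proposition 1.6 Let $V$ be a finite-dimensional vector space and let $f \colon V \to V$ be a linear
map. If $V$ is a cyclic $\mathbb{C}[X]$-module and if $(X - \lambda)^n$ is the minimal polynomial of $f$, then
there is a basis of $V$ of the form

\[
\bigl((f - \lambda \mathrm{id})^{n-1}(u), (f - \lambda \mathrm{id})^{n-2}(u), \ldots, (f - \lambda \mathrm{id})(u), u\bigr),
\]

for some $u \in V$. With respect to this basis, the matrix of $f$ is the Jordan block

\[
J_n(\lambda) =
\begin{pmatrix}
\lambda & 1 & 0 & \cdots & 0 \\
0 & \lambda & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \ddots & \vdots \\
0 & 0 & 0 & \ddots & 1 \\
0 & 0 & 0 & \cdots & \lambda
\end{pmatrix}.
\]

Consequently, $\lambda$ is an eigenvalue of $f$.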
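Proof. Since $V$ is a cyclic $\mathbb{C}[X]$-module, there is some $u \in V$ so that $V$ is generated by
$u, f(u), f^2(u), \ldots$, which means that every vector in $V$ is of the form $p(f)(u)$, for some
polynomial, $p(X)$. We claim that $u, f(u), \ldots, f^{n-2}(u), f^{n-1}(u)$ generate $V$, which implies
that

\[
u, (f - \lambda \mathrm{id})(u), \ldots, (f - \lambda \mathrm{id})^{n-2}(u), (f - \lambda \mathrm{id})^{n-1}(u)
\]

generate $V$.

This is because if $p(X)$ is any polynomial of degree at least $n$, then we can divide $p(X)$
by $(X - \lambda)^n$ obtaining

\[
p = (X - \lambda)^n q + r,
\]

where $0 \leq \deg(r) < n$ and as $(X - \lambda)^n$ annihilates $V$, we get

\[
p(f)(u) = r(f)(u),
\]

which means that every vector of the form $p(f)(u)$ with $p(X)$ of degree $\geq n$ is actually a
linear combination of $u, f(u), \ldots, f^{n-2}(u), f^{n-1}(u)$. We can expand each $(f - \lambda \mathrm{id})^k$ using
the binomial formula (because $f$ commutes with itself and with $\mathrm{id}$), so $(f - \lambda \mathrm{id})^k(u)$ is a
linear combination of $u, f(u), \ldots, f^k(u)$ in which $f^k(u)$ has coefficient 1. Conversely, each
$f^k(u)$ is then a linear combination of $u, (f - \lambda \mathrm{id})(u), \ldots, (f - \lambda \mathrm{id})^k(u)$, and so

\[
u, (f - \lambda \mathrm{id})(u), \ldots, (f - \lambda \mathrm{id})^{n-2}(u), (f - \lambda \mathrm{id})^{n-1}(u)
\]

generate $V$. Furthermore, we claim that the above vectors are linearly independent. Indeed,
if we had a nontrivial linear combination

\[
a_0 (f - \lambda \mathrm{id})^{n-1}(u) + a_1 (f - \lambda \mathrm{id})^{n-2}(u) + \cdots + a_{n-2} (f - \lambda \mathrm{id})(u) + a_{n-1} u = 0,
\]

then the polynomial

\[
a_0 (X - \lambda)^{n-1} + a_1 (X - \lambda)^{n-2} + \cdots + a_{n-2} (X - \lambda) + a_{n-1}
\]

of degree at most $n - 1$ would annihilate $u$, hence every vector $p(f)(u)$ (as polynomials in $f$
commute), hence $V$, contradicting the fact that $(X - \lambda)^n$ is the
minimal polynomial of $f$ (and thus, of smallest degree). Consequently,

\[
\bigl((f - \lambda \mathrm{id})^{n-1}(u), (f - \lambda \mathrm{id})^{n-2}(u), \ldots, (f - \lambda \mathrm{id})(u), u\bigr)
\]

is a basis of $V$ and since $u, f(u), \ldots, f^{n-2}(u), f^{n-1}(u)$ span $V$,

\[
\bigl(u, f(u), \ldots, f^{n-2}(u), f^{n-1}(u)\bigr)
\]

is also a basis of $V$. Let us see how $f$ acts on the basis

\[
\bigl((f - \lambda \mathrm{id})^{n-1}(u), (f - \lambda \mathrm{id})^{n-2}(u), \ldots, (f - \lambda \mathrm{id})(u), u\bigr).
\]

If we write $f = f - \lambda \mathrm{id} + \lambda \mathrm{id}$, as $(f - \lambda \mathrm{id})^n$ annihilates $V$, we get

\[
f\bigl((f - \lambda \mathrm{id})^{n-1}(u)\bigr) = (f - \lambda \mathrm{id})^{n}(u) + \lambda (f - \lambda \mathrm{id})^{n-1}(u) = \lambda (f - \lambda \mathrm{id})^{n-1}(u)
\]

and

\[
f\bigl((f - \lambda \mathrm{id})^{k}(u)\bigr) = (f - \lambda \mathrm{id})^{k+1}(u) + \lambda (f - \lambda \mathrm{id})^{k}(u), \quad 0 \leq k \leq n - 2.
\]

But this means precisely that the matrix of $f$ in this basis is the Jordan block $J_n(\lambda)$. □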
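The basis of Proposition 1.6 can be computed explicitly on a small example. The Python sketch below is an illustration only: the $3 \times 3$ matrix, the generator $u$ and the helper names are our own choices. The matrix has minimal polynomial $(X - 2)^3$, and the code checks the two relations derived at the end of the proof; it does not check linear independence, which is visible by inspection here.

```python
def mat_vec(a, x):
    """Apply the matrix a (list of rows) to the vector x."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in a]


def jordan_chain(a, lam, u):
    """Return [(f - lam id)^(n-1)(u), ..., (f - lam id)(u), u]."""
    n = len(a)
    shifted = [[a[i][j] - (lam if i == j else 0) for j in range(n)]
               for i in range(n)]
    chain = [u]
    for _ in range(n - 1):
        chain.append(mat_vec(shifted, chain[-1]))
    return chain[::-1]


def acts_as_jordan_block(a, lam, basis):
    """Check f(b_1) = lam b_1 and f(b_k) = b_(k-1) + lam b_k for k >= 2."""
    for k, b in enumerate(basis):
        expected = [lam * x for x in b]
        if k > 0:
            expected = [p + q for p, q in zip(basis[k - 1], expected)]
        if mat_vec(a, b) != expected:
            return False
    return True


# A - 2I is nilpotent of index 3, so the minimal polynomial of A is (X - 2)^3.
A = [[3, 1, 0],
     [-1, 1, 1],
     [0, 0, 2]]
u = [0, 0, 1]  # a generator of V = C^3 as a C[X]-module for this A

basis = jordan_chain(A, 2, u)
assert basis == [[1, -1, 0], [0, 1, 0], [0, 0, 1]]
assert acts_as_jordan_block(A, 2, basis)
```

The first basis vector, $(1, -1, 0)$, is an eigenvector of $A$ for the eigenvalue 2, in agreement with the last statement of the proposition.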
Using Theorem 1.5 and Proposition 1.6 we get the Jordan form for complex matrices.

Theorem 1.7 (Jordan Form) For every complex $n \times n$ matrix, $A$, there is some invertible
matrix, $P$, and some Jordan matrix, $J$, so that

\[
A = PJP^{-1}.
\]

If $\{\lambda_1, \ldots, \lambda_s\}$ is the set of eigenvalues of $A$, then the diagonal elements of the Jordan
blocks of $J$ are among the $\lambda_i$ and every $\lambda_i$ corresponds to one or more Jordan blocks of $J$.
Furthermore, the number, $m$, of Jordan blocks, the distinct Jordan blocks, $J_{r_i}(\lambda_i)$, and the
number of times, $m_i$, that each Jordan block, $J_{r_i}(\lambda_i)$, occurs are uniquely determined by $A$.

The number $m_i$ is called the multiplicity of the block $J_{r_i}(\lambda_i)$. Observe that the column
vector associated with the first entry of every Jordan block is an eigenvector of $A$. Thus, the
number, $m$, of Jordan blocks is the number of linearly independent eigenvectors of $A$.

Besides the references that we cited for the proof of Theorem 1.5, other proofs of Theorem
1.7 can be found in the literature. Often, these proofs do not cover the uniqueness statement.
For example, a nice proof is given in Godement [12], Chapter 35. Another interesting proof
is given in Strang [27], Appendix B. A more "computational proof" is given in Horn and
Johnson, Chapter 3, Sections 1-4.

Observe that Theorem 1.7 implies that the characteristic polynomial, $q_f(X)$, of $f$ is the
product of the elementary divisors of $f$ (counted with their multiplicity). But then, $q_f(X)$
must annihilate $V$. Therefore, we obtain a quick proof of the Cayley-Hamilton Theorem (of
course, we had to work hard to get Theorem 1.7). Also, the minimal polynomial of $f$ is the
least common multiple (lcm) of the elementary divisors of $f$.

The following technical result will be needed for finding the logarithm of a real matrix:

Proposition 1.8 If $J$ is a $2n \times 2n$ complex Jordan matrix consisting of two conjugate blocks
$J_n(\lambda + i\mu)$ and $J_n(\lambda - i\mu)$ of dimension $n$ ($\mu \neq 0$), then there is a permutation matrix, $P$,
and a matrix, $E$, so that

\[
J = PEP^{-1},
\]

where $E$ is a block matrix of the form

\[
E =
\begin{pmatrix}
D & I & 0 & \cdots & 0 \\
0 & D & I & \cdots & 0 \\
\vdots & \vdots & \ddots & \ddots & \vdots \\
0 & 0 & 0 & \ddots & I \\
0 & 0 & 0 & \cdots & D
\end{pmatrix},
\]

and with $D$ the diagonal $2 \times 2$ matrix

\[
D =
\begin{pmatrix}
\lambda + i\mu & 0 \\
0 & \lambda - i\mu
\end{pmatrix}.
\]

Furthermore, there is a complex invertible matrix, $Q$, and a real Jordan matrix, $C$, so that

\[
J = QCQ^{-1},
\]

where $C$ is of the form

\[
C =
\begin{pmatrix}
L & I & 0 & \cdots & 0 \\
0 & L & I & \cdots & 0 \\
\vdots & \vdots & \ddots & \ddots & \vdots \\
0 & 0 & 0 & \ddots & I \\
0 & 0 & 0 & \cdots & L
\end{pmatrix},
\]

with

\[
L =
\begin{pmatrix}
\lambda & -\mu \\
\mu & \lambda
\end{pmatrix}.
\]

Proof. First, consider an example, namely,



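\[
J =
\begin{pmatrix}
\lambda + i\mu & 1 & 0 & 0 \\
0 & \lambda + i\mu & 0 & 0 \\
0 & 0 & \lambda - i\mu & 1 \\
0 & 0 & 0 & \lambda - i\mu
\end{pmatrix}.
\]

If we permute rows 2 and 3, we get

\[
\begin{pmatrix}
\lambda + i\mu & 1 & 0 & 0 \\
0 & 0 & \lambda - i\mu & 1 \\
0 & \lambda + i\mu & 0 & 0 \\
0 & 0 & 0 & \lambda - i\mu
\end{pmatrix}
\]

and if we then permute columns 2 and 3, we get our matrix,

\[
E =
\begin{pmatrix}
\lambda + i\mu & 0 & 1 & 0 \\
0 & \lambda - i\mu & 0 & 1 \\
0 & 0 & \lambda + i\mu & 0 \\
0 & 0 & 0 & \lambda - i\mu
\end{pmatrix}.
\]

We leave it as an exercise to generalize this method to two $n \times n$ conjugate Jordan blocks to
prove that we can find a permutation matrix, $P$, so that $E = P^{-1}JP$ and thus, $J = PEP^{-1}$.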



Next, as $\mu \neq 0$, the matrix $L$ can be diagonalized and one easily checks that

\[
D =
\begin{pmatrix}
\lambda + i\mu & 0 \\
0 & \lambda - i\mu
\end{pmatrix}
=
\begin{pmatrix}
-i & 1 \\
-i & -1
\end{pmatrix}
\begin{pmatrix}
\lambda & -\mu \\
\mu & \lambda
\end{pmatrix}
\begin{pmatrix}
-i & 1 \\
-i & -1
\end{pmatrix}^{-1}.
\]

Therefore, using the block diagonal matrix $S = \mathrm{diag}(S_2, \ldots, S_2)$ consisting of $n$ blocks

\[
S_2 =
\begin{pmatrix}
-i & 1 \\
-i & -1
\end{pmatrix},
\]

we see that

\[
E = SCS^{-1}
\]

and thus,

\[
J = PSCS^{-1}P^{-1},
\]

which yields our second result with $Q = PS$. □
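Both steps of the proof can be verified mechanically for $n = 2$. The following Python sketch (an illustration with the sample values $\lambda = 1$, $\mu = 2$, chosen by us) checks that $PJP = E$ for the permutation matrix $P$ exchanging indices 2 and 3, and that $SC = ES$, which is equivalent to $E = SCS^{-1}$ since $\det(S_2) = 2i \neq 0$.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


lam, mu = 1, 2  # sample values; any real lam and any real mu != 0 would do
a, a_bar = complex(lam, mu), complex(lam, -mu)

# Two conjugate Jordan blocks J_2(lam + i mu) and J_2(lam - i mu).
J = [[a, 1, 0, 0],
     [0, a, 0, 0],
     [0, 0, a_bar, 1],
     [0, 0, 0, a_bar]]

# Permutation matrix exchanging indices 2 and 3; note that P is its own inverse.
P = [[1, 0, 0, 0],
     [0, 0, 1, 0],
     [0, 1, 0, 0],
     [0, 0, 0, 1]]

E = [[a, 0, 1, 0],
     [0, a_bar, 0, 1],
     [0, 0, a, 0],
     [0, 0, 0, a_bar]]

assert mat_mul(mat_mul(P, J), P) == E   # E = P^{-1} J P
assert mat_mul(mat_mul(P, E), P) == J   # J = P E P^{-1}

# S = diag(S_2, S_2) and the real Jordan matrix C with diagonal blocks L(lam, mu).
S = [[-1j, 1, 0, 0],
     [-1j, -1, 0, 0],
     [0, 0, -1j, 1],
     [0, 0, -1j, -1]]
C = [[lam, -mu, 1, 0],
     [mu, lam, 0, 1],
     [0, 0, lam, -mu],
     [0, 0, mu, lam]]

assert mat_mul(S, C) == mat_mul(E, S)   # equivalent to E = S C S^{-1}
```

Checking $SC = ES$ instead of $E = SCS^{-1}$ avoids computing an inverse; with small integer entries the complex arithmetic is exact, so plain equality is safe in this example.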

Proposition 1.8 shows that every real matrix, $A$, is similar over $\mathbb{C}$ to a real Jordan matrix,
since the Jordan blocks of a real matrix associated with nonreal eigenvalues occur in conjugate pairs.
Unfortunately, this does not guarantee that we can find a real invertible
matrix, $P$, so that $A = PJP^{-1}$, with $J$ a real Jordan matrix. This result, known as the Real
Jordan Form, is actually true but requires some work to be established.

To the best of our knowledge, a complete proof is not easily found. Horn and Johnson
state such a result as Theorem 3.4.5 in Chapter 3, Section 4, in [19]. However, they leave
the details of the proof that a real $P$ can be found as an exercise. We found that a proof
can be obtained from Theorem 1.5. Since we believe that some of the techniques involved in
this proof are of independent interest, we present this proof in full detail. It should be noted
that we were inspired by some arguments found in Gantmacher [14], Chapter IX, Section
13.



Theorem 1.9 (Real Jordan Form) For every real $n \times n$ matrix, $A$, there is some invertible
(real) matrix, $P$, and some real Jordan matrix, $J$, so that

\[
A = PJP^{-1}.
\]

For every Jordan block, $J_r(\lambda)$, of type (1), $\lambda$ is some real eigenvalue of $A$ and for every
Jordan block, $J_{2r}(\lambda, \mu)$, of type (2), $\lambda + i\mu$ is a complex eigenvalue of $A$ (with $\mu \neq 0$). Every
eigenvalue of $A$ corresponds to one or more Jordan blocks of $J$. Furthermore, the number,
$m$, of Jordan blocks, the distinct Jordan blocks, $J_{s_i}(\alpha_i)$, and the number of times, $m_i$, that
each Jordan block, $J_{s_i}(\alpha_i)$, occurs are uniquely determined by $A$.

Proof. Let $f \colon V \to V$ be the linear map defined by $A$ and let $f_{\mathbb{C}}$ be the complexification of
$f$. Then, Theorem 1.5 yields a direct sum decomposition of $V_{\mathbb{C}}$ of the form

\[
V_{\mathbb{C}} = V_1 \oplus \cdots \oplus V_m, \qquad (*)
\]

where each $V_i$ is a cyclic $\mathbb{C}[X]$-module (associated with $f_{\mathbb{C}}$) whose minimal polynomial is
of the form $(X - \alpha_i)^{r_i}$, where $\alpha_i$ is some (possibly complex) eigenvalue of $f$. If $W$ is any
subspace of $V_{\mathbb{C}}$, we define the conjugate, $\overline{W}$, of $W$ by

\[
\overline{W} = \{u - iv \in V_{\mathbb{C}} \mid u + iv \in W\}.
\]
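It is clear that $\overline{W}$ is a subspace of $V_{\mathbb{C}}$ of the same dimension as $W$ and obviously, $\overline{V_{\mathbb{C}}} = V_{\mathbb{C}}$.
Our first goal is to prove the following claim:

Claim 1. For each factor, $V_j$, the following properties hold:

(1) If $u + iv, f_{\mathbb{C}}(u + iv), \ldots, f_{\mathbb{C}}^{r_j - 1}(u + iv)$ span $V_j$, then $u - iv, f_{\mathbb{C}}(u - iv), \ldots, f_{\mathbb{C}}^{r_j - 1}(u - iv)$
span $\overline{V_j}$ and so, $\overline{V_j}$ is cyclic with respect to $f_{\mathbb{C}}$.

(2) If $(X - \alpha_j)^{r_j}$ is the minimal polynomial of $V_j$, then $(X - \overline{\alpha_j})^{r_j}$ is the minimal polynomial
of $\overline{V_j}$.

Proof of Claim 1. As $f_{\mathbb{C}}(u + iv) = f(u) + i f(v)$, we have $f_{\mathbb{C}}(u - iv) = f(u) - i f(v)$. It follows
that $f_{\mathbb{C}}^k(u + iv) = f^k(u) + i f^k(v)$ and $f_{\mathbb{C}}^k(u - iv) = f^k(u) - i f^k(v)$, which implies that if $V_j$
is generated by $u + iv, f_{\mathbb{C}}(u + iv), \ldots, f_{\mathbb{C}}^{r_j - 1}(u + iv)$ then $\overline{V_j}$ is generated by
$u - iv, f_{\mathbb{C}}(u - iv), \ldots, f_{\mathbb{C}}^{r_j - 1}(u - iv)$. Therefore, $\overline{V_j}$ is cyclic for $f_{\mathbb{C}}$.

We also prove the following simple fact: If

\[
(f_{\mathbb{C}} - (\lambda_j + i\mu_j)\mathrm{id})(u + iv) = x + iy,
\]

then

\[
(f_{\mathbb{C}} - (\lambda_j - i\mu_j)\mathrm{id})(u - iv) = x - iy.
\]

Indeed, we have

\[
\begin{aligned}
x + iy &= (f_{\mathbb{C}} - (\lambda_j + i\mu_j)\mathrm{id})(u + iv) \\
&= f_{\mathbb{C}}(u + iv) - (\lambda_j + i\mu_j)(u + iv) \\
&= f(u) + i f(v) - (\lambda_j + i\mu_j)(u + iv)
\end{aligned}
\]

and by taking conjugates, we get

\[
\begin{aligned}
x - iy &= f(u) - i f(v) - (\lambda_j - i\mu_j)(u - iv) \\
&= f_{\mathbb{C}}(u - iv) - (\lambda_j - i\mu_j)(u - iv) \\
&= (f_{\mathbb{C}} - (\lambda_j - i\mu_j)\mathrm{id})(u - iv),
\end{aligned}
\]

as claimed.

From the above, $(f_{\mathbb{C}} - \alpha_j \mathrm{id})^{r_j}(x + iy) = 0$ iff $(f_{\mathbb{C}} - \overline{\alpha_j} \mathrm{id})^{r_j}(x - iy) = 0$. Thus, $(X - \overline{\alpha_j})^{r_j}$
annihilates $\overline{V_j}$ and as $\dim \overline{V_j} = \dim V_j$ and $\overline{V_j}$ is cyclic, we conclude that $(X - \overline{\alpha_j})^{r_j}$ is the
minimal polynomial of $\overline{V_j}$. □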

Next we prove

Claim 2. For every factor, $V_j$, in the direct decomposition (*), we have:

(A) If $(X - \lambda_j)^{r_j}$ is the minimal polynomial of $V_j$, with $\lambda_j \in \mathbb{R}$, then either

(1) $\overline{V_j} = V_j$ and if $u + iv$ generates $V_j$, then $u - iv$ also generates $V_j$, or

(2) $V_j \cap \overline{V_j} = (0)$ and

(a) the cyclic space $\overline{V_j}$ also occurs in the direct sum decomposition (*)

(b) the minimal polynomial of $\overline{V_j}$ is $(X - \lambda_j)^{r_j}$

(c) the spaces $V_j$ and $\overline{V_j}$ contain only complex vectors (this means that if $x + iy \in V_j$ is nonzero,
then $x \neq 0$ and $y \neq 0$ and similarly for $\overline{V_j}$).

(B) If $(X - (\lambda_j + i\mu_j))^{r_j}$ is the minimal polynomial of $V_j$ with $\mu_j \neq 0$, then

(d) $V_j \cap \overline{V_j} = (0)$

(e) the cyclic space $\overline{V_j}$ also occurs in the direct sum decomposition (*)

(f) the minimal polynomial of $\overline{V_j}$ is $(X - (\lambda_j - i\mu_j))^{r_j}$

(g) the spaces $V_j$ and $\overline{V_j}$ contain only complex vectors.

Proof of Claim 2. By taking the conjugate of the direct sum decomposition (*) we get

\[
V_{\mathbb{C}} = \overline{V_1} \oplus \cdots \oplus \overline{V_m}.
\]

By Claim 1, each $\overline{V_j}$ is a cyclic subspace with respect to $f_{\mathbb{C}}$ of the same dimension as $V_j$ and
the minimal polynomial of $\overline{V_j}$ is $(X - \overline{\alpha_j})^{r_j}$ if the minimal polynomial of $V_j$ is $(X - \alpha_j)^{r_j}$.
It follows from the uniqueness assertion of Theorem 1.5 that the list of conjugate minimal
polynomials

\[
(X - \overline{\alpha_1})^{r_1}, \ldots, (X - \overline{\alpha_m})^{r_m}
\]

is a permutation of the list of minimal polynomials

\[
(X - \alpha_1)^{r_1}, \ldots, (X - \alpha_m)^{r_m}
\]

and so, every $\overline{V_j}$ is equal to some factor $V_k$ (possibly equal to $V_j$ if $\alpha_j$ is real) in the direct
decomposition (*), where $V_k$ and $\overline{V_j}$ have the same minimal polynomial, $(X - \overline{\alpha_j})^{r_j}$.

Next, assume that $(X - \lambda_j)^{r_j}$ is the minimal polynomial of $V_j$, with $\lambda_j \in \mathbb{R}$. Consider
any generator, $u + iv$, for $V_j$. If $u - iv \in V_j$, then by Claim 1, $\overline{V_j} \subseteq V_j$ and so $\overline{V_j} = V_j$,
as $\dim \overline{V_j} = \dim V_j$. We know that $u + iv, f_{\mathbb{C}}(u + iv), \ldots, f_{\mathbb{C}}^{r_j - 1}(u + iv)$ generate $V_j$ and that
$u - iv, f_{\mathbb{C}}(u - iv), \ldots, f_{\mathbb{C}}^{r_j - 1}(u - iv)$ generate $\overline{V_j} = V_j$, which implies (1).

If $u - iv \notin V_j$, then we proved earlier that $\overline{V_j}$ occurs in the direct sum (*) as some $V_k$
and that its minimal polynomial is also $(X - \lambda_j)^{r_j}$. Since $u - iv \notin V_j$ and $V_j$ and $\overline{V_j}$ belong
to a direct sum decomposition, $V_j \cap \overline{V_j} = (0)$ and 2(a) and 2(b) hold. If $u \in V_j$ or $iv \in V_j$
for some real $u \in V$ or some real $v \in V$ with $u, v \neq 0$, then, as $V_j$ is a complex space, $v \in V_j$
and either $u \in \overline{V_j}$ or $v \in \overline{V_j}$ (a real vector is its own conjugate), contradicting $V_j \cap \overline{V_j} = (0)$. Thus, 2(c) holds.

Now, consider the case where $\alpha_j = \lambda_j + i\mu_j$, with $\mu_j \neq 0$. Then, we know that $\overline{V_j} = V_k$
for some $V_k$ whose minimal polynomial is $(X - (\lambda_j - i\mu_j))^{r_j}$ in the direct sum (*). As $\mu_j \neq 0$,
the cyclic spaces $V_j$ and $\overline{V_j}$ correspond to distinct minimal polynomials $(X - (\lambda_j + i\mu_j))^{r_j}$
and $(X - (\lambda_j - i\mu_j))^{r_j}$, so $V_j \cap \overline{V_j} = (0)$. It follows that $V_j$ and $\overline{V_j}$ consist of complex vectors
as we already observed. Therefore, (d), (e), (f), (g) are proved, which finishes the proof of
Claim 2. □

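We now show how to produce some linearly independent vectors in $V$ so that the matrix
of $f$ over these vectors is a real Jordan block.

(B) First, consider the case where the minimal polynomial of $V_j$ is $(X - (\lambda_j + i\mu_j))^{r_j}$ with
$\mu_j \neq 0$.

By Claim 1(1), if $u + iv$ generates $V_j$, then $u - iv$ generates $\overline{V_j}$ and by Proposition 1.6,
the subspace $V_j$ has a basis $(u_1 + iv_1, \ldots, u_{r_j} + iv_{r_j})$ and the subspace $\overline{V_j}$ has a basis
$(u_1 - iv_1, \ldots, u_{r_j} - iv_{r_j})$, with

\[
u_k + iv_k = (f_{\mathbb{C}} - (\lambda_j + i\mu_j)\mathrm{id})^{r_j - k}(u + iv), \quad 1 \leq k \leq r_j,
\]

and $u_k, v_k \neq 0$ for $k = 1, \ldots, r_j$. Recall that

\[
\begin{aligned}
f_{\mathbb{C}}(u_1 + iv_1) &= (\lambda_j + i\mu_j)(u_1 + iv_1) \\
f_{\mathbb{C}}(u_k + iv_k) &= u_{k-1} + iv_{k-1} + (\lambda_j + i\mu_j)(u_k + iv_k), \quad 2 \leq k \leq r_j.
\end{aligned}
\]

Thus, we get

\[
\begin{aligned}
f(u_1) + i f(v_1) &= \lambda_j u_1 - \mu_j v_1 + i(\mu_j u_1 + \lambda_j v_1) \\
f(u_k) + i f(v_k) &= u_{k-1} + \lambda_j u_k - \mu_j v_k + i(v_{k-1} + \mu_j u_k + \lambda_j v_k), \quad 2 \leq k \leq r_j,
\end{aligned}
\]

which yields

\[
\begin{aligned}
f(u_1) &= \lambda_j u_1 - \mu_j v_1 \\
f(v_1) &= \mu_j u_1 + \lambda_j v_1 \\
f(u_k) &= u_{k-1} + \lambda_j u_k - \mu_j v_k \\
f(v_k) &= v_{k-1} + \mu_j u_k + \lambda_j v_k, \quad 2 \leq k \leq r_j.
\end{aligned}
\]

Now, $(u_1 + iv_1, \ldots, u_{r_j} + iv_{r_j})$ form a basis of complex vectors of $V_j$, $(u_1 - iv_1, \ldots, u_{r_j} - iv_{r_j})$
form a basis of complex vectors of $\overline{V_j}$ and $V_j \cap \overline{V_j} = (0)$, so the vectors

\[
u_1, v_1, u_2, v_2, \ldots, u_{r_j}, v_{r_j}
\]

are linearly independent. Using these vectors, the matrix giving $f(u_1)$ and $f(v_1)$ over $(u_1, v_1)$
is

\[
\begin{pmatrix}
\lambda_j & \mu_j \\
-\mu_j & \lambda_j
\end{pmatrix}
\]

and the matrix giving $f(u_k)$ and $f(v_k)$ over $(u_{k-1}, v_{k-1}, u_k, v_k)$ for $k \geq 2$ is

\[
\begin{pmatrix}
1 & 0 \\
0 & 1 \\
\lambda_j & \mu_j \\
-\mu_j & \lambda_j
\end{pmatrix}.
\]

This means that over the basis $(u_1, v_1, u_2, v_2, \ldots, u_{r_j}, v_{r_j})$, the restriction of $f$ is a real Jordan
block (namely $J_{2r_j}(\lambda_j, -\mu_j)$, whose diagonal blocks are $L(\lambda_j, -\mu_j)$).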

(A) Let us now consider the case where the minimal polynomial of $V_j$ is $(X - \lambda_j)^{r_j}$, with
$\lambda_j \in \mathbb{R}$. If $\overline{V_j} = V_j$ and if $u + iv$ generates $V_j$, then $V_j$ has the two bases $(u_1 + iv_1, \ldots, u_{r_j} + iv_{r_j})$
and $(u_1 - iv_1, \ldots, u_{r_j} - iv_{r_j})$, with

\[
u_k + iv_k = (f_{\mathbb{C}} - \lambda_j \mathrm{id})^{r_j - k}(u + iv), \quad 1 \leq k \leq r_j.
\]

But then, either $(u_1, \ldots, u_{r_j})$ are linearly independent or $(v_1, \ldots, v_{r_j})$ are linearly
independent. As $\mu_j = 0$, the computation in (B) yields

\[
\begin{aligned}
f(u_1) &= \lambda_j u_1 \\
f(v_1) &= \lambda_j v_1 \\
f(u_k) &= u_{k-1} + \lambda_j u_k \\
f(v_k) &= v_{k-1} + \lambda_j v_k, \quad 2 \leq k \leq r_j.
\end{aligned}
\]

If $(u_1, \ldots, u_{r_j})$ is linearly independent we see that

\[
f(u_1) = \lambda_j u_1
\]

and the matrix giving $f(u_k)$ over $(u_{k-1}, u_k)$ for $k \geq 2$ is

\[
\begin{pmatrix}
1 \\
\lambda_j
\end{pmatrix}.
\]

This means that over the basis $(u_1, u_2, \ldots, u_{r_j})$, the restriction of $f$ is a real Jordan block.
If $(v_1, \ldots, v_{r_j})$ is linearly independent, then we obtain the same Jordan block.

If $V_j \cap \overline{V_j} = (0)$ and if $u + iv$ generates $V_j$, then, as in (B), $V_j$ has the basis
$(u_1 + iv_1, \ldots, u_{r_j} + iv_{r_j})$ and $\overline{V_j}$ has the basis $(u_1 - iv_1, \ldots, u_{r_j} - iv_{r_j})$, with

\[
u_k + iv_k = (f_{\mathbb{C}} - \lambda_j \mathrm{id})^{r_j - k}(u + iv), \quad 1 \leq k \leq r_j.
\]

Moreover, in this case, all vectors are complex. As a consequence, $(u_1, \ldots, u_{r_j})$ and
$(v_1, \ldots, v_{r_j})$ are linearly independent. Since the computation made in the previous case
still holds ($\mu_j = 0$), we see that the restriction of $f$ is a real Jordan block over the basis
$(u_1, \ldots, u_{r_j})$.

Finally, by taking the union of all the real bases either associated with a conjugate pair
$(V_j, \overline{V_j})$ or with a subspace $V_j$ corresponding to a real eigenvalue $\lambda_j$, we obtain a matrix for
$f$ which is a real Jordan matrix. □

Let $A$ be a real matrix and let $(X - \alpha_1)^{r_1}, \ldots, (X - \alpha_m)^{r_m}$ be its list of elementary
divisors or, equivalently, let $J_{r_1}(\alpha_1), \ldots, J_{r_m}(\alpha_m)$ be its list of Jordan blocks. If, for every
$r_i$ and every real eigenvalue $\lambda_i < 0$, the number, $m_i$, of Jordan blocks identical to $J_{r_i}(\lambda_i)$ is
even, then there is a way to rearrange these blocks using the technique of Proposition 1.8 to
obtain a version of the real Jordan form that makes it easy to find logarithms (and square
roots) of real matrices.
Theorem 1.10 (Real Jordan Form, Special Version) Let $A$ be a real $n \times n$ matrix and let
$(X - \alpha_1)^{r_1}, \ldots, (X - \alpha_m)^{r_m}$ be its list of elementary divisors or, equivalently, let $J_{r_1}(\alpha_1), \ldots,
J_{r_m}(\alpha_m)$ be its list of Jordan blocks. If, for every $r_i$ and every real eigenvalue $\alpha_i < 0$, the
number, $m_i$, of Jordan blocks identical to $J_{r_i}(\alpha_i)$ is even, then there is a real invertible matrix,
$P$, and a real Jordan matrix, $J'$, such that $A = PJ'P^{-1}$ and

(1) Every block, $J_{r_i}(\alpha_i)$, of $J$ for which $\alpha_i \in \mathbb{R}$ and $\alpha_i \geq 0$ is a Jordan block of type (1) of
$J'$ (as in Definition 1.4), or

(2) For every block, $J_{r_i}(\alpha_i)$, of $J$ for which either $\alpha_i \in \mathbb{R}$ and $\alpha_i < 0$ or $\alpha_i = \lambda_i + i\mu_i$ with
$\mu_i \neq 0$ ($\lambda_i, \mu_i \in \mathbb{R}$), the corresponding real Jordan block of $J'$ is defined as follows:

(a) If $\mu_i \neq 0$, then $J'$ contains the real Jordan block $J_{2r_i}(\lambda_i, \mu_i)$ of type (2) (as in
Definition 1.4), or

(b) If $\alpha_i < 0$ then $J'$ contains the real Jordan block $J_{2r_i}(\alpha_i, 0)$ whose diagonal blocks
are of the form

\[
L(\alpha_i, 0) =
\begin{pmatrix}
\alpha_i & 0 \\
0 & \alpha_i
\end{pmatrix}.
\]

Proof. By hypothesis, for every real eigenvalue, $\alpha_i < 0$, for every $r_i$, the Jordan block,
$J_{r_i}(\alpha_i)$, occurs an even number of times, say $2t_i$, so by using a permutation, we may assume
that we have $t_i$ pairs of identical blocks $(J_{r_i}(\alpha_i), J_{r_i}(\alpha_i))$. But then, for each pair of blocks
of this form, we can apply part (1) of Proposition 1.8 (since $\alpha_i$ is its own conjugate), which
yields our result. □



Remark: The above result generalizes the fact that when we have a rotation matrix, $R$, the
eigenvalues $-1$ occurring in the real block diagonal form of $R$ can be paired up.

The following theorem shows that the "structure" of the Jordan form of a matrix is 
preserved under exponentiation. This is an important result that will be needed to establish 
the necessity of the criterion for a real matrix to have a real logarithm. 

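Theorem 1.11 For any (real or complex) $n \times n$ matrix, $A$, if $A = PJP^{-1}$ where $J$ is a
Jordan matrix of the form

\[
J =
\begin{pmatrix}
J_{r_1}(\lambda_1) & \cdots & 0 \\
\vdots & \ddots & \vdots \\
0 & \cdots & J_{r_m}(\lambda_m)
\end{pmatrix},
\]

then there is some invertible matrix, $Q$, so that the Jordan form of $e^A$ is given by

\[
e^A = Q\, e(J)\, Q^{-1},
\]

where $e(J)$ is the Jordan matrix

\[
e(J) =
\begin{pmatrix}
J_{r_1}(e^{\lambda_1}) & \cdots & 0 \\
\vdots & \ddots & \vdots \\
0 & \cdots & J_{r_m}(e^{\lambda_m})
\end{pmatrix},
\]

that is, each $J_{r_k}(e^{\lambda_k})$ is obtained from $J_{r_k}(\lambda_k)$ by replacing all the diagonal entries $\lambda_k$ by $e^{\lambda_k}$.
Equivalently, if the list of elementary divisors of $A$ is

\[
(X - \lambda_1)^{r_1}, \ldots, (X - \lambda_m)^{r_m},
\]

then the list of elementary divisors of $e^A$ is

\[
(X - e^{\lambda_1})^{r_1}, \ldots, (X - e^{\lambda_m})^{r_m}.
\]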

Proof. Theorem 1.11 is a consequence of a general theorem about functions of matrices
proved in Gantmacher [13], see Chapter VI, Section 8, Theorem 9. Because a much more
general result is proved, the proof in Gantmacher [13] is rather involved. However, it is
possible to give a simpler proof exploiting special properties of the exponential map.

Let $f$ be the linear map defined by the matrix $A$. The strategy of our proof is to go back
to the direct sum decomposition given by Theorem 1.5,

\[
V = V_1 \oplus V_2 \oplus \cdots \oplus V_m,
\]

where each $V_i$ is a cyclic $\mathbb{C}[X]$-module such that the minimal polynomial of the restriction
of $f$ to $V_i$ is of the form $(X - \lambda_i)^{r_i}$. We will prove that

(1) The vectors

\[
u,\ e^f(u),\ (e^f)^2(u),\ \ldots,\ (e^f)^{r_i - 1}(u)
\]

form a basis of $V_i$ (here, $(e^f)^k = e^f \circ \cdots \circ e^f$, the composition of $e^f$ with itself $k$ times).

(2) The polynomial $(X - e^{\lambda_i})^{r_i}$ is the minimal polynomial of the restriction of $e^f$ to $V_i$.

First, we prove that $V_i$ is invariant under $e^f$. Let $N = f - \lambda_i \mathrm{id}$. To say that $(X - \lambda_i)^{r_i}$
is the minimal polynomial of the restriction of $f$ to $V_i$ is equivalent to saying that $N$ is
nilpotent with index of nilpotency, $r = r_i$. Now, $N$ and $\lambda_i \mathrm{id}$ commute so as $f = N + \lambda_i \mathrm{id}$,
we have

\[
e^f = e^{N + \lambda_i \mathrm{id}} = e^N e^{\lambda_i \mathrm{id}} = e^{\lambda_i} e^N.
\]
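Furthermore, as $N$ is nilpotent, we have

\[
e^N = \mathrm{id} + N + \frac{N^2}{2!} + \cdots + \frac{N^{r-1}}{(r-1)!},
\]

so

\[
e^f = e^{\lambda_i}\left(\mathrm{id} + N + \frac{N^2}{2!} + \cdots + \frac{N^{r-1}}{(r-1)!}\right).
\]

Now, $V_i$ is invariant under $f$ so $V_i$ is invariant under $N = f - \lambda_i \mathrm{id}$ and this implies that $V_i$
is invariant under $e^f$. Thus, we can view $V_i$ as a $\mathbb{C}[X]$-module with respect to $e^f$.

From the formula for $e^f$ we get

\[
e^f - e^{\lambda_i}\mathrm{id} = e^{\lambda_i}\left(\mathrm{id} + N + \frac{N^2}{2!} + \cdots + \frac{N^{r-1}}{(r-1)!}\right) - e^{\lambda_i}\mathrm{id}
= e^{\lambda_i}\left(N + \frac{N^2}{2!} + \cdots + \frac{N^{r-1}}{(r-1)!}\right).
\]

If we let

\[
\widetilde{N} = N + \frac{N^2}{2!} + \cdots + \frac{N^{r-1}}{(r-1)!},
\]

we claim that

\[
\widetilde{N}^{r-1} = N^{r-1} \quad\text{and}\quad \widetilde{N}^{r} = 0.
\]

The case $r = 1$ is trivial so we may assume $r \geq 2$. Since $\widetilde{N} = NR$ for some $R$ such that
$NR = RN$ and $N^r = 0$, the second property is clear. The first property follows by observing
that $\widetilde{N} = N + N^2 T$, where $N$ and $T$ commute, so using the binomial formula,

\[
\widetilde{N}^{r-1} = \sum_{k=0}^{r-1} \binom{r-1}{k} N^k (N^2 T)^{r-1-k}
= \sum_{k=0}^{r-1} \binom{r-1}{k} N^{2r-k-2}\, T^{r-1-k} = N^{r-1},
\]

since $2r - k - 2 \geq r$ for $0 \leq k \leq r - 2$ and $N^r = 0$.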
Recall from Proposition 1.6 that

\[
((f - \lambda_i \mathrm{id})^{r_i - 1}(u), \ldots, (f - \lambda_i \mathrm{id})(u), u)
\]

is a basis of $V_i$, which implies that $N^{r-1}(u) = (f - \lambda_i \mathrm{id})^{r_i - 1}(u) \neq 0$. Since $\widetilde{N}^{r-1} = N^{r-1}$, we
have $\widetilde{N}^{r-1}(u) \neq 0$ and as $\widetilde{N}^{r} = 0$, we have $\widetilde{N}^{r}(u) = 0$. It is well-known that these two facts
imply that

\[
u,\ \widetilde{N}(u),\ \ldots,\ \widetilde{N}^{r-1}(u)
\]

are linearly independent. Indeed, if we had a linear dependence relation

\[
a_0 u + a_1 \widetilde{N}(u) + \cdots + a_{r-1}\widetilde{N}^{r-1}(u) = 0,
\]

by applying $\widetilde{N}^{r-1}$, as $\widetilde{N}^{r}(u) = 0$ we get $a_0 \widetilde{N}^{r-1}(u) = 0$, so $a_0 = 0$ as $\widetilde{N}^{r-1}(u) \neq 0$; by
applying $\widetilde{N}^{r-2}$ we get $a_1 \widetilde{N}^{r-1}(u) = 0$, so $a_1 = 0$; using induction, by applying $\widetilde{N}^{r-k-2}$ to

\[
a_{k+1}\widetilde{N}^{k+1}(u) + \cdots + a_{r-1}\widetilde{N}^{r-1}(u) = 0,
\]

we get $a_{k+1} = 0$ for $k = 0, \ldots, r - 2$. Since $V_i$ has dimension $r$ ($= r_i$), we deduce that

\[
(u, \widetilde{N}(u), \ldots, \widetilde{N}^{r-1}(u))
\]

is a basis of $V_i$. But $e^f = e^{\lambda_i}(\mathrm{id} + \widetilde{N})$, so for $k = 0, \ldots, r - 1$, each $\widetilde{N}^k(u)$ is a linear
combination of the vectors $u, e^f(u), \ldots, (e^f)^{r-1}(u)$, which implies that

\[
(u, e^f(u), (e^f)^2(u), \ldots, (e^f)^{r-1}(u))
\]

is a basis of $V_i$. This implies that any annihilating polynomial of $V_i$ has degree no less than
$r$ and since $(X - e^{\lambda_i})^r$ annihilates $V_i$, it is the minimal polynomial of $V_i$.

In summary, we proved that each $V_i$ is a cyclic $\mathbb{C}[X]$-module (with respect to $e^f$) and
that in the direct sum decomposition

\[
V = V_1 \oplus \cdots \oplus V_m,
\]

the polynomial $(X - e^{\lambda_i})^{r_i}$ is the minimal polynomial of $V_i$, which is Theorem 1.5 for $e^f$.
Then, Theorem 1.11 follows immediately from Proposition 1.6. □
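
As an informal sanity check of Theorem 1.11 (not part of the proof), the following sketch computes $e^{J_r(\lambda)}$ for a single real Jordan block from the finite sum used above and verifies that $e^{J} - e^{\lambda} I$ is nilpotent of index exactly $r$, which is what one expects if $e^{J}$ consists of a single Jordan block $J_r(e^{\lambda})$. The helper names (`exp_jordan_block`, `nilpotency_index`, `jordan_index_of_exp`) are hypothetical, and the sketch assumes a real eigenvalue and plain Python lists as matrices.

```python
import math

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def exp_jordan_block(lam, r):
    """e^{J_r(lam)} = e^lam (I + N + N^2/2! + ... + N^{r-1}/(r-1)!), N = J_r(lam) - lam I."""
    shift = [[1.0 if j == i + 1 else 0.0 for j in range(r)] for i in range(r)]
    total, power = identity(r), identity(r)
    for k in range(1, r):
        power = mat_mul(power, shift)  # power is now N^k
        total = [[total[i][j] + power[i][j] / math.factorial(k) for j in range(r)]
                 for i in range(r)]
    return [[math.exp(lam) * x for x in row] for row in total]

def nilpotency_index(m):
    """Smallest k >= 1 with m^k == 0, or None if m^n != 0 for the n x n matrix m."""
    n = len(m)
    power = identity(n)
    for k in range(1, n + 1):
        power = mat_mul(power, m)
        if all(x == 0 for row in power for x in row):
            return k
    return None

def jordan_index_of_exp(lam, r):
    """Nilpotency index of e^{J_r(lam)} - e^lam I."""
    e = exp_jordan_block(lam, r)
    c = math.exp(lam)
    shifted = [[e[i][j] - (c if i == j else 0.0) for j in range(r)] for i in range(r)]
    return nilpotency_index(shifted)
```

For instance, `jordan_index_of_exp(0.5, 4)` returns the block size 4, in agreement with the elementary divisor $(X - e^{\lambda})^{4}$ predicted by the theorem.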

2 Logarithms of Real Matrices; Criteria for Existence 
and Uniqueness 

If $A$ is any (complex) $n \times n$ matrix we say that a matrix, $X$, is a logarithm of $A$ iff $e^X = A$.
Our goal is to find conditions for the existence and uniqueness of real logarithms of real
matrices. The two main theorems of this section are Theorem 2.4 and Theorem 2.11. These
theorems are used in papers presenting methods for computing the logarithm of a matrix,
including Cheng, Higham, Kenney and Laub and Kenney and Laub [22].

Cheng, Higham, Kenney and Laub cite Kenney and Laub [22] for a proof of Theorem 2.11 but in fact, that
paper does not give a proof. Kenney and Laub [22] do state Theorem 2.11 as Lemma A2
of Appendix A, but they simply say that "the proof is similar to that of Lemma A1". As
to the proof of Lemma A1, Kenney and Laub state without detail that it makes use of the
Cauchy integral formula for operators, a method used by DePrima and Johnson [12] to prove
a similar theorem for complex matrices (Section 4, Lemma 1) and where uniqueness is also
proved. Kenney and Laub point out that the third hypothesis in that lemma is redundant.
Theorem 2.11 also appears in Higham's book [17] as Theorem 1.31. Its proof relies on
Theorem 1.28 and Theorem 1.18 (both in Higham's book) but Theorem 1.28 is not proved
and only part of Theorem 1.18 is proved in the text (closer examination reveals that Theorem
1.36 (in Higham's book) is needed to prove Theorem 1.28). Although Higham's Theorem
1.28 implies the injectivity statement of Theorem 2.9, we feel that the proof of Theorem 2.9
is of independent interest. Furthermore, Theorem 2.9 is a stronger result (it shows that exp
is a diffeomorphism).

Given this state of affairs where no explicit proof of Theorem 2.11 seems easily available,
we provide a complete proof of Theorem 2.11 using our special form of the Real Jordan Form.

First, let us consider the case where $A$ is a complex matrix. Now, we know that if $A = e^X$,
then $\det(A) = e^{\mathrm{tr}(X)} \neq 0$, so $A$ must be invertible. It turns out that this condition is also
sufficient.

Recall that for every invertible matrix, $P$, and every matrix, $A$,

\[
e^{PAP^{-1}} = P e^{A} P^{-1}
\]

and that for every block diagonal matrix,

\[
A = \begin{pmatrix} A_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & A_m \end{pmatrix},
\]

we have

\[
e^{A} = \begin{pmatrix} e^{A_1} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & e^{A_m} \end{pmatrix}.
\]

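Consequently, the problem of finding the logarithm of a matrix reduces to the problem of
finding the logarithm of a Jordan block $J_r(\alpha)$ with $\alpha \neq 0$. However, every such Jordan block,
$J_r(\alpha)$, can be written as

\[
J_r(\alpha) = \alpha I + H = \alpha I (I + \alpha^{-1} H),
\]

where $H$ is the nilpotent matrix of index of nilpotency, $r$, given by

\[
H = \begin{pmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \ddots & \vdots \\
0 & 0 & \cdots & 0 & 1 \\
0 & 0 & \cdots & 0 & 0
\end{pmatrix}.
\]

Furthermore, it is obvious that $N = \alpha^{-1} H$ is also nilpotent of index of nilpotency, $r$, and we
have

\[
J_r(\alpha) = \alpha I (I + N).
\]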

Logarithms of the diagonal matrix, $\alpha I$, are easily found. If we write $\alpha = \rho e^{i\theta}$ where
$\rho > 0$, then $\log \alpha = \log \rho + i(\theta + 2\pi h)$, for any $h \in \mathbb{Z}$, and we can pick a logarithm of $\alpha I$ to
be

\[
S = \begin{pmatrix}
\log \rho + i\theta & 0 & \cdots & 0 \\
0 & \log \rho + i\theta & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \log \rho + i\theta
\end{pmatrix}.
\]

Observe that if we can find a logarithm, $M$, of $I + N$, then as $S$ commutes with any matrix and
as $e^{S} = \alpha I$ and $e^{M} = I + N$, we have

\[
e^{S + M} = e^{S} e^{M} = \alpha I (I + N) = J_r(\alpha),
\]

which means that $S + M$ is a logarithm of $J_r(\alpha)$. Therefore, the problem reduces to finding
the logarithm of a unipotent matrix, $I + N$. However, this problem always has a solution.
To see this, remember that for $|u| < 1$, the power series

\[
\log(1 + u) = u - \frac{u^2}{2} + \frac{u^3}{3} + \cdots + (-1)^{n+1}\frac{u^n}{n} + \cdots
\]

is normally convergent. It turns out that the above fact can be generalized to matrices in
the following way:

Proposition 2.1 For every $n \times n$ matrix, $A$, such that $\|A\| < 1$, the series

\[
\log(I + A) = A - \frac{A^2}{2} + \frac{A^3}{3} + \cdots + (-1)^{n+1}\frac{A^n}{n} + \cdots
\]

is normally convergent for any norm $\|\ \|$ on $\mathbb{C}^{n^2}$. Furthermore, if $\|A\| < 1$, we have

\[
e^{\log(I + A)} = I + A.
\]



Remark: The formal power series, $e^A - I$ and $\log(I + A)$, are mutual inverses, $e^A - I$ converges
normally everywhere and $\log(I + A)$ converges normally for $\|A\| < 1$ so, there is some $r$,
with $0 < r < 1$, so that

\[
\log(e^A) = A, \quad\text{if } \|A\| < r.
\]

For any given $r \geq 1$, the exponential and the logarithm (of matrices) turn out to give a
homeomorphism between the set of nilpotent matrices, $N$, and the set of unipotent matrices,
$I + N$, for which $N^r = 0$. Let $\mathcal{N}il(r)$ denote the set of (real or complex) nilpotent matrices
of any dimension $n \geq 1$ such that $N^r = 0$ and $\mathcal{U}ni(r)$ denote the set of unipotent matrices,
$U = I + N$, where $N \in \mathcal{N}il(r)$. If $U = I + N \in \mathcal{U}ni(r)$, note that $\log(I + N)$ is well-defined
since the power series for $\log(I + N)$ has at most $r - 1$ nonzero terms,

\[
\log(I + N) = N - \frac{N^2}{2} + \frac{N^3}{3} + \cdots + (-1)^{r}\frac{N^{r-1}}{r-1}.
\]

Proposition 2.2 The exponential map, $\exp\colon \mathcal{N}il(r) \to \mathcal{U}ni(r)$, is a homeomorphism whose
inverse is the logarithm.

Proof. A complete proof can be found in Mneimné and Testard [25], Chapter 3, Theorem
3.3.3. The idea is to prove that

\[
\log(e^N) = N, \ \text{for all } N \in \mathcal{N}il(r) \quad\text{and}\quad e^{\log(U)} = U, \ \text{for all } U \in \mathcal{U}ni(r).
\]

To prove the first identity, it is enough to show that for any fixed $N \in \mathcal{N}il(r)$, we have

\[
\log(e^{tN}) = tN, \quad\text{for all } t \in \mathbb{R}.
\]

To do this, observe that the functions $t \mapsto tN$ and $t \mapsto \log(e^{tN})$ are both equal to $0$ for $t = 0$.
Thus, it is enough to show that their derivatives are equal, which is left as an exercise.

Next, for any $N \in \mathcal{N}il(r)$, the map

\[
t \mapsto e^{\log(I + tN)} - (I + tN), \quad t \in \mathbb{R}
\]

is a polynomial, since $N^r = 0$. Furthermore, for $t$ sufficiently small, $\|tN\| < 1$ and in
view of Proposition 2.1 we have $e^{\log(I + tN)} = I + tN$, so the above polynomial vanishes in a
neighborhood of $0$, which implies that it is identically zero. Therefore, $e^{\log(I + N)} = I + N$, as
required. The continuity of exp and log is obvious. □
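
The two identities in Proposition 2.2 can be checked on small examples with exact rational arithmetic, since both series are finite on nilpotent matrices. The sketch below is only an illustration under stated assumptions: matrices are square Python lists with integer or rational entries, the unipotent input is assumed to be $I + N$ with $N$ nilpotent (so that $N^n = 0$ for an $n \times n$ matrix), and the helper names `log_unipotent` and `exp_nilpotent` are hypothetical.

```python
from fractions import Fraction

def identity(n):
    return [[Fraction(1 if i == j else 0) for j in range(n)] for i in range(n)]

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add_scaled(a, c, b):
    """Return a + c * b entrywise."""
    n = len(a)
    return [[a[i][j] + c * b[i][j] for j in range(n)] for i in range(n)]

def log_unipotent(u):
    """log(I + N) = N - N^2/2 + N^3/3 - ... (finite sum, N = u - I nilpotent)."""
    n = len(u)
    nil = [[u[i][j] - (1 if i == j else 0) for j in range(n)] for i in range(n)]
    result = [[Fraction(0)] * n for _ in range(n)]
    power = identity(n)
    for k in range(1, n):  # N^n = 0 for an n x n nilpotent matrix
        power = mat_mul(power, nil)
        result = add_scaled(result, Fraction((-1) ** (k + 1), k), power)
    return result

def exp_nilpotent(m):
    """e^M = I + M + M^2/2! + ... (finite sum, M nilpotent)."""
    n = len(m)
    result, power, factorial = identity(n), identity(n), 1
    for k in range(1, n):
        power = mat_mul(power, m)
        factorial *= k
        result = add_scaled(result, Fraction(1, factorial), power)
    return result
```

For the unipotent matrix `u = [[1, 2, 3], [0, 1, 4], [0, 0, 1]]`, `log_unipotent(u)` is `[[0, 2, -1], [0, 0, 4], [0, 0, 0]]` and `exp_nilpotent` applied to that result gives back `u` exactly.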

Proposition 2.2 shows that every unipotent matrix, $I + N$, has the unique logarithm

\[
\log(I + N) = N - \frac{N^2}{2} + \frac{N^3}{3} + \cdots + (-1)^{r}\frac{N^{r-1}}{r-1},
\]

where $r$ is the index of nilpotency of $N$. Therefore, if we let $M = \log(I + N)$, we have finally
found a logarithm, $S + M$, for our original matrix, $A$. As a result of all this, we have proved
the following theorem:

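Theorem 2.3 Every $n \times n$ invertible complex matrix, $A$, has a logarithm, $X$. To find such
a logarithm, we can proceed as follows:

(1) Compute a Jordan form, $A = PJP^{-1}$, for $A$ and let $m$ be the number of Jordan blocks
in $J$.

(2) For every Jordan block, $J_{r_k}(\alpha_k)$, of $J$, write $J_{r_k}(\alpha_k) = \alpha_k I (I + N_k)$, where $N_k$ is
nilpotent.

(3) If $\alpha_k = \rho_k e^{i\theta_k}$, with $\rho_k > 0$, let

\[
S_k = \begin{pmatrix}
\log \rho_k + i\theta_k & 0 & \cdots & 0 \\
0 & \log \rho_k + i\theta_k & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \log \rho_k + i\theta_k
\end{pmatrix}.
\]

We have $\alpha_k I = e^{S_k}$.

(4) For every $N_k$, let

\[
M_k = N_k - \frac{N_k^2}{2} + \frac{N_k^3}{3} + \cdots + (-1)^{r_k}\frac{N_k^{r_k - 1}}{r_k - 1}.
\]

We have $I + N_k = e^{M_k}$.

(5) If $Y_k = S_k + M_k$ and $Y$ is the block diagonal matrix $\mathrm{diag}(Y_1, \ldots, Y_m)$, then

\[
X = PYP^{-1}
\]

is a logarithm of $A$.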
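
Steps (2) to (4) of Theorem 2.3 can be illustrated on a single Jordan block (so $P = I$ and $m = 1$). The sketch below is a hedged illustration, not a numerical method: it uses the fact that $N_k = \alpha_k^{-1} H$, so $N_k^j$ has $\alpha_k^{-j}$ on its $j$-th superdiagonal, and it takes the principal value returned by Python's `cmath.log` for $\log \rho_k + i\theta_k$. The function names `log_jordan_block` and `mat_exp` are hypothetical, and `mat_exp` is a plain truncated Taylor series that is only adequate for small matrices of moderate norm.

```python
import cmath

def log_jordan_block(alpha, r):
    """Return Y = S + M with e^Y = J_r(alpha), for alpha != 0 (steps (2)-(4))."""
    if alpha == 0:
        raise ValueError("J_r(0) is singular and has no logarithm")
    s = cmath.log(alpha)  # principal value: log(rho) + i*theta
    y = [[0j] * r for _ in range(r)]
    for i in range(r):
        y[i][i] = s  # diagonal part S
        for j in range(1, r - i):
            # coefficient of N^j in log(I + N), with N^j = alpha^(-j) H^j
            y[i][i + j] = (-1) ** (j + 1) / (j * alpha ** j)
    return y

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(m, terms=60):
    """Truncated Taylor series for e^m (illustration only)."""
    n = len(m)
    result = [[1 + 0j if i == j else 0j for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, m)]  # m^k / k!
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result
```

With `alpha = -2` and `r = 3`, the matrix `mat_exp(log_jordan_block(-2, 3))` agrees with $J_3(-2)$ up to rounding error; the logarithm obtained this way is complex, which is the difficulty addressed next for real matrices.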

Let us now assume that $A$ is a real matrix and let us try to find a real logarithm. There is
no problem in finding real logarithms of the nilpotent parts but we run into trouble whenever
an eigenvalue is complex or real negative. Fortunately, we can circumvent these problems
by using the real Jordan form, provided that the condition of Theorem 1.10 holds.

The theorem below gives a necessary and sufficient condition for a real matrix to have a
real logarithm. The first occurrence of this theorem that we have found in the literature is a
paper by Culver [11] published in 1966. The proofs in this paper rely heavily on results from
Gantmacher [13]. Theorem 2.4 is also stated in Horn and Johnson [20] as Theorem 6.4.15
(Chapter 6), but the proof is left as an exercise. We offer a proof using Theorem 1.10 which
is more explicit than Culver's proof.

Theorem 2.4 Let $A$ be a real $n \times n$ matrix and let $(X - \alpha_1)^{r_1}, \ldots, (X - \alpha_m)^{r_m}$ be its list
of elementary divisors or, equivalently, let $J_{r_1}(\alpha_1), \ldots, J_{r_m}(\alpha_m)$ be its list of Jordan blocks.
Then, $A$ has a real logarithm iff $A$ is invertible and if, for every $r_i$ and every real eigenvalue
$\alpha_i < 0$, the number, $m_i$, of Jordan blocks identical to $J_{r_i}(\alpha_i)$ is even.

Proof. First, assume that $A$ satisfies the conditions of Theorem 2.4. Since the matrix $A$
satisfies the condition of Theorem 1.10, there is a real invertible matrix, $P$, and a real
Jordan matrix, $J'$, so that

\[
A = PJ'P^{-1},
\]

where $J'$ satisfies conditions (1) and (2) of Theorem 1.10. As $A$ is invertible, every block
of $J'$ of the form $J_{r_k}(\alpha_k)$ corresponds to a real eigenvalue with $\alpha_k > 0$ and we can write
$J_{r_k}(\alpha_k) = \alpha_k I (I + N_k)$, where $N_k$ is nilpotent. As in Theorem 2.3 (4), we can find a real
logarithm, $M_k$, of $I + N_k$ and as $\alpha_k > 0$, the diagonal matrix $\alpha_k I$ has the real logarithm

\[
S_k = \begin{pmatrix}
\log \alpha_k & 0 & \cdots & 0 \\
0 & \log \alpha_k & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \log \alpha_k
\end{pmatrix}.
\]

Set $Y_k = S_k + M_k$.

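The other real Jordan blocks of $J'$ are of the form $J_{2r_k}(\lambda_k, \mu_k)$, with $\lambda_k, \mu_k \in \mathbb{R}$, not both
zero. Consequently, we can write

\[
J_{2r_k}(\lambda_k, \mu_k) = D_k + H_k = D_k(I + D_k^{-1} H_k)
\]

where

\[
D_k = \begin{pmatrix} L(\lambda_k, \mu_k) & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & L(\lambda_k, \mu_k) \end{pmatrix}
\]

with

\[
L(\lambda_k, \mu_k) = \begin{pmatrix} \lambda_k & -\mu_k \\ \mu_k & \lambda_k \end{pmatrix},
\]

and $H_k$ is a real nilpotent matrix. If we let $N_k = D_k^{-1} H_k$, then $N_k$ is also nilpotent,
$J_{2r_k}(\lambda_k, \mu_k) = D_k(I + N_k)$, and we can find a logarithm, $M_k$, of $I + N_k$ as in Theorem
2.3 (4). We can write $\lambda_k + i\mu_k = \rho_k e^{i\theta_k}$, with $\rho_k > 0$ and $\theta_k \in [-\pi, \pi)$, and then

\[
L(\lambda_k, \mu_k) = \rho_k \begin{pmatrix} \cos\theta_k & -\sin\theta_k \\ \sin\theta_k & \cos\theta_k \end{pmatrix}.
\]

If we set

\[
S(\rho_k, \theta_k) = \begin{pmatrix} \log \rho_k & -\theta_k \\ \theta_k & \log \rho_k \end{pmatrix},
\]

a real matrix, we claim that

\[
L(\lambda_k, \mu_k) = e^{S(\rho_k, \theta_k)}.
\]

Indeed, $S(\rho_k, \theta_k) = \log \rho_k I + \theta_k E_2$, with

\[
E_2 = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix},
\]

and it is well known that

\[
e^{\theta_k E_2} = \begin{pmatrix} \cos\theta_k & -\sin\theta_k \\ \sin\theta_k & \cos\theta_k \end{pmatrix},
\]

so, as $\log \rho_k I$ and $\theta_k E_2$ commute, we get

\[
e^{S(\rho_k, \theta_k)} = e^{\log \rho_k I + \theta_k E_2} = e^{\log \rho_k I}\, e^{\theta_k E_2}
= \rho_k \begin{pmatrix} \cos\theta_k & -\sin\theta_k \\ \sin\theta_k & \cos\theta_k \end{pmatrix} = L(\lambda_k, \mu_k).
\]

If we form the real block diagonal matrix,

\[
S_k = \begin{pmatrix} S(\rho_k, \theta_k) & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & S(\rho_k, \theta_k) \end{pmatrix},
\]

we have $D_k = e^{S_k}$. Since $S_k$ and $M_k$ commute and

\[
e^{S_k + M_k} = e^{S_k} e^{M_k} = D_k (I + N_k) = J_{2r_k}(\lambda_k, \mu_k),
\]

the matrix $Y_k = S_k + M_k$ is a logarithm of $J_{2r_k}(\lambda_k, \mu_k)$. Finally, if $Y$ is the block diagonal
matrix $\mathrm{diag}(Y_1, \ldots, Y_m)$, then $X = PYP^{-1}$ is a logarithm of $A$.

Let us now prove that if $A$ has a real logarithm, $X$, then $A$ satisfies the condition
of Theorem 2.4. As we said before, $A$ must be invertible. Since $X$ is a real matrix, we
know from the proof of Theorem 1.9 that the Jordan blocks of $X$ associated with complex
eigenvalues occur in conjugate pairs, so they are of the form

\[
J_{r_k}(\alpha_k) \ \text{and}\ J_{r_k}(\overline{\alpha}_k), \quad \alpha_k = \lambda_k + i\mu_k, \ \mu_k \neq 0.
\]

By Theorem 1.11, the Jordan blocks of $A = e^X$ are obtained by replacing each $\alpha_k$ by $e^{\alpha_k}$,
that is, they are of the form

\[
J_{r_k}(e^{\alpha_k}) \ \text{and}\ J_{r_k}(e^{\overline{\alpha}_k}), \quad \alpha_k = \lambda_k + i\mu_k, \ \mu_k \neq 0.
\]

If $\alpha_k \in \mathbb{R}$, then $e^{\alpha_k} > 0$, so the negative eigenvalues of $A$ must be of the form $e^{\alpha_k}$ or
$e^{\overline{\alpha}_k}$, with $\alpha_k$ complex. This implies that $\alpha_k = \lambda_k + (2h + 1)i\pi$, for some $h \in \mathbb{Z}$, but then
$\overline{\alpha}_k = \lambda_k - (2h + 1)i\pi$ and so

\[
e^{\alpha_k} = e^{\overline{\alpha}_k}.
\]

Consequently, negative eigenvalues of $A$ are associated with Jordan blocks that occur in pairs,
as claimed. □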
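
The key $2 \times 2$ computation in the sufficiency part of the proof, $L(\lambda_k, \mu_k) = e^{S(\rho_k, \theta_k)}$, can be checked numerically. The following is a minimal sketch under stated assumptions: the function names `real_log_block` and `mat_exp_2x2` are hypothetical, the angle comes from `math.atan2`, which returns values in $[-\pi, \pi]$ rather than the half-open interval $[-\pi, \pi)$ used in the proof, and the exponential is a truncated Taylor series meant for illustration only.

```python
import math

def real_log_block(lam, mu):
    """Return S(rho, theta) = [[log rho, -theta], [theta, log rho]] for lam + i*mu = rho e^{i theta}."""
    rho = math.hypot(lam, mu)
    if rho == 0.0:
        raise ValueError("lam and mu must not both be zero")
    theta = math.atan2(mu, lam)
    return [[math.log(rho), -theta], [theta, math.log(rho)]]

def mat_exp_2x2(s, terms=60):
    """Truncated Taylor series for e^s, s a real 2 x 2 matrix (illustration only)."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        term = [[sum(term[i][p] * s[p][j] for p in range(2)) / k for j in range(2)]
                for i in range(2)]  # s^k / k!
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result
```

For example, `real_log_block(-1.0, 0.0)` is $\pi E_2$, and its exponential is $-I_2$ up to rounding: a pair of identical $1 \times 1$ blocks for the eigenvalue $-1$ has a real logarithm, as Theorem 2.4 predicts.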



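Remark: It can be shown (see Culver [11]) that all the logarithms of a Jordan block, $J_{r_k}(\alpha_k)$,
corresponding to a real eigenvalue $\alpha_k > 0$ are obtained by adding the matrices

\[
i 2\pi h_k I, \quad h_k \in \mathbb{Z},
\]

to the solution given by the proof of Theorem 2.4 and that all the logarithms of a Jordan
block, $J_{2r_k}(\alpha_k, \beta_k)$, are obtained by adding the matrices

\[
i 2\pi h_k I + 2\pi l_k E, \quad h_k, l_k \in \mathbb{Z},
\]

to the solution given by the proof of Theorem 2.4, where

\[
E = \begin{pmatrix} E_2 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & E_2 \end{pmatrix},
\qquad
E_2 = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.
\]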



One should be careful not to relax the condition of Theorem 2.4 to the more liberal
condition stating that for every Jordan block, $J_{r_k}(\alpha_k)$, for which $\alpha_k < 0$, the dimension $r_k$
is even (i.e., $\alpha_k$ occurs an even number of times). For example, the following matrix

\[
A = \begin{pmatrix} -1 & 1 \\ 0 & -1 \end{pmatrix}
\]

satisfies the more liberal condition but it does not possess any real logarithm, as the reader
will verify. On the other hand, we have the following corollary:

Corollary 2.5 For every real invertible matrix, $A$, if $A$ has no negative eigenvalues, then
$A$ has a real logarithm.

More results about the number of real logarithms of real matrices can be found in Culver
[11]. In particular, Culver gives a necessary and sufficient condition for a real matrix, $A$, to
have a unique real logarithm. This condition is quite strong. In particular, it requires that
all the eigenvalues of $A$ be real and positive.

A different approach is to restrict the domain of real logarithms to obtain a sufficient 
condition for the uniqueness of a logarithm. We now discuss this approach. First, we state 
the following property that will be useful later: 

Proposition 2.6 For every (real or complex) invertible matrix, $A$, there is a semisimple
matrix, $S$, and a unipotent matrix, $U$, so that

\[
A = SU \quad\text{and}\quad SU = US.
\]

Furthermore, $S$ and $U$ as above are unique.

Proof. Proposition 2.6 follows immediately from Theorem 1.3; the details are left as an
exercise. □

The form, SU, of an invertible matrix is often called the multiplicative Jordan decompo- 
sition. 

Definition 2.7 Let $\mathcal{S}(n)$ denote the set of all real matrices whose eigenvalues, $\lambda + i\mu$, lie in
the horizontal strip determined by the condition $-\pi < \mu < \pi$.

It is easy to see that $\mathcal{S}(n)$ is star-shaped (which means that if it contains $A$, then it
contains $\lambda A$ for all $\lambda \in [0, 1]$) and open (because the roots of a polynomial are continuous
functions of the coefficients of the polynomial). As $\mathcal{S}(n)$ is star-shaped, it is path-connected.
Furthermore, if $A \in \mathcal{S}(n)$, then $PAP^{-1} \in \mathcal{S}(n)$ for every invertible matrix, $P$. The remarkable
property of $\mathcal{S}(n)$ is that the restriction of the exponential to $\mathcal{S}(n)$ is a diffeomorphism
onto its image. To prove this fact we will need the following proposition:

Proposition 2.8 For any two real or complex matrices, $S_1$ and $S_2$, if the eigenvalues, $\lambda + i\mu$,
of $S_1$ and $S_2$ satisfy the condition $-\pi < \mu \leq \pi$, if $S_1$ and $S_2$ are semisimple and if $e^{S_1} = e^{S_2}$,
then $S_1 = S_2$.

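Proof. Since $S_1$ and $S_2$ are semisimple, they can be diagonalized over $\mathbb{C}$, so let $(u_1, \ldots, u_n)$
be a basis of eigenvectors of $S_1$ associated with the (possibly complex) eigenvalues $\lambda_1, \ldots, \lambda_n$
and let $(v_1, \ldots, v_n)$ be a basis of eigenvectors of $S_2$ associated with the (possibly complex)
eigenvalues $\mu_1, \ldots, \mu_n$. We prove that if $e^{S_1} = e^{S_2} = A$, then $S_1(v_i) = S_2(v_i)$ for all $v_i$, which
shows that $S_1 = S_2$.

Pick any eigenvector, $v_i$, of $S_2$ and write $v = v_i$ and $\mu = \mu_i$. We have

\[
v = \alpha_1 u_1 + \cdots + \alpha_n u_n,
\]

for some unique $\alpha_j$. We compute $A(v)$ in two different ways. We know that $e^{\mu_1}, \ldots, e^{\mu_n}$ are
the eigenvalues of $e^{S_2}$ for the eigenvectors $v_1, \ldots, v_n$, so

\[
A(v) = e^{S_2}(v) = e^{\mu} v = \alpha_1 e^{\mu} u_1 + \cdots + \alpha_n e^{\mu} u_n.
\]

Similarly, we know that $e^{\lambda_1}, \ldots, e^{\lambda_n}$ are the eigenvalues of $e^{S_1}$ for the eigenvectors $u_1, \ldots, u_n$,
so

\[
\begin{aligned}
A(v) &= A(\alpha_1 u_1 + \cdots + \alpha_n u_n) \\
&= \alpha_1 A(u_1) + \cdots + \alpha_n A(u_n) \\
&= \alpha_1 e^{S_1}(u_1) + \cdots + \alpha_n e^{S_1}(u_n) \\
&= \alpha_1 e^{\lambda_1} u_1 + \cdots + \alpha_n e^{\lambda_n} u_n.
\end{aligned}
\]

Therefore, we deduce that

\[
\alpha_k e^{\mu} = \alpha_k e^{\lambda_k}, \quad 1 \leq k \leq n.
\]

Consequently, if $\alpha_k \neq 0$, then

\[
e^{\mu} = e^{\lambda_k},
\]

which implies $\mu - \lambda_k = i 2\pi h$, for some $h \in \mathbb{Z}$. However, due to the hypothesis on the
eigenvalues of $S_1$ and $S_2$, $\mu$ and $\lambda_k$ must belong to the horizontal strip determined by the
condition $-\pi < \Im(z) \leq \pi$, so we must have $h = 0$ and then $\mu = \lambda_k$. If we let
$I = \{k \mid \lambda_k = \mu\}$, then $v = \sum_{k \in I} \alpha_k u_k$ and we have

\[
\begin{aligned}
S_1(v) &= S_1\Bigl(\sum_{k \in I} \alpha_k u_k\Bigr) \\
&= \sum_{k \in I} \alpha_k S_1(u_k) \\
&= \sum_{k \in I} \alpha_k \lambda_k u_k \\
&= \sum_{k \in I} \alpha_k \mu\, u_k \\
&= \mu \sum_{k \in I} \alpha_k u_k = \mu v.
\end{aligned}
\]

Therefore, $S_1(v) = \mu v$. As $v$ is an eigenvector of $S_2$ for the eigenvalue $\mu$, we also have
$S_2(v) = \mu v$. Therefore,

\[
S_1(v_i) = S_2(v_i), \quad i = 1, \ldots, n,
\]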

which proves that $S_1 = S_2$. □

Obviously, Proposition 2.8 holds for real semisimple matrices, $S_1, S_2$, in $\mathcal{S}(n)$, since the
condition for being in $\mathcal{S}(n)$ is $-\pi < \Im(\alpha) < \pi$ for every eigenvalue, $\alpha$, of $S_1$ or $S_2$.

We can now state our next theorem, an important result. This theorem is a consequence
of a more general fact proved in Bourbaki [8] (Chapter III, Section 6.9, Proposition 17, see
also Theorem 6).

Theorem 2.9 The restriction of the exponential map to $\mathcal{S}(n)$ is a diffeomorphism of $\mathcal{S}(n)$
onto its image, $\exp(\mathcal{S}(n))$. If $A \in \exp(\mathcal{S}(n))$, then $PAP^{-1} \in \exp(\mathcal{S}(n))$, for every (real) invertible
matrix, $P$. Furthermore, $\exp(\mathcal{S}(n))$ is an open subset of $\mathbf{GL}(n, \mathbb{R})$ containing $I$ and
$\exp(\mathcal{S}(n))$ contains the open ball, $B(I, 1) = \{A \in \mathbf{GL}(n, \mathbb{R}) \mid \|A - I\| < 1\}$, for every norm
$\|\ \|$ on $n \times n$ matrices satisfying the condition $\|AB\| \leq \|A\|\,\|B\|$.

Proof. A complete proof is given in Mneimné and Testard [25], Chapter 3, Theorem 3.8.4.
Part of the proof consists in showing that exp is a local diffeomorphism and for this, to prove
that $d\exp(X)$ is invertible. This requires finding an explicit formula for the derivative of the
exponential and we prefer to omit this computation, which is quite technical. Proving that
$B(I, 1) \subseteq \exp(\mathcal{S}(n))$ is easier but requires a little bit of complex analysis. Once these facts are
established, it remains to prove that exp is injective on $\mathcal{S}(n)$, which we will prove.

The trick is to use both the Jordan decomposition and the multiplicative Jordan
decomposition! Assume that $X_1, X_2 \in \mathcal{S}(n)$ and that $e^{X_1} = e^{X_2}$. Using Theorem 1.3 we can
write $X_1 = S_1 + N_1$ and $X_2 = S_2 + N_2$, where $S_1, S_2$ are semisimple, $N_1, N_2$ are nilpotent,
$S_1 N_1 = N_1 S_1$, and $S_2 N_2 = N_2 S_2$. From $e^{X_1} = e^{X_2}$, we get

\[
e^{S_1} e^{N_1} = e^{S_1 + N_1} = e^{S_2 + N_2} = e^{S_2} e^{N_2}.
\]

Now, $S_1$ and $S_2$ are semisimple, so $e^{S_1}$ and $e^{S_2}$ are semisimple and $N_1$ and $N_2$ are nilpotent
so $e^{N_1}$ and $e^{N_2}$ are unipotent. Moreover, as $S_1 N_1 = N_1 S_1$ and $S_2 N_2 = N_2 S_2$, we have
$e^{S_1} e^{N_1} = e^{N_1} e^{S_1}$ and $e^{S_2} e^{N_2} = e^{N_2} e^{S_2}$. By the uniqueness property of Proposition 2.6 we
conclude that

\[
e^{S_1} = e^{S_2} \quad\text{and}\quad e^{N_1} = e^{N_2}.
\]

Now, as $N_1$ and $N_2$ are nilpotent, there is some $r$ so that $N_1^r = N_2^r = 0$ and then, it is clear
that $e^{N_1} = I + \widetilde{N}_1$ and $e^{N_2} = I + \widetilde{N}_2$ with $\widetilde{N}_1^r = 0$ and $\widetilde{N}_2^r = 0$. Therefore, we can apply
Proposition 2.2 to conclude that

\[
N_1 = N_2.
\]

As $S_1, S_2 \in \mathcal{S}(n)$ are semisimple and $e^{S_1} = e^{S_2}$, by Proposition 2.8 we conclude that

\[
S_1 = S_2.
\]

Therefore, we finally proved that $X_1 = X_2$, showing that exp is injective on $\mathcal{S}(n)$. □

Remark: Since Proposition 2.8 holds for semisimple matrices, $S$, such that the condition
$-\pi < \mu \leq \pi$ holds for every eigenvalue, $\lambda + i\mu$, of $S$, the restriction of the exponential to
real matrices, $X$, whose eigenvalues satisfy this condition is injective. Note that the image of
these matrices under the exponential contains matrices, $A = e^X$, with negative eigenvalues.
Thus, combining Theorem 2.4 and the above injectivity result we could state an existence
and uniqueness result for real logarithms of real matrices that is more general than Theorem
2.11 below. However, this is not a practical result since it requires a condition on the number
of Jordan blocks and such a condition is hard to check. Thus, we will restrict ourselves to
real matrices with no negative eigenvalues (see Theorem 2.11).

Since the eigenvalues of a nilpotent matrix are zero and since symmetric matrices have
real eigenvalues, Theorem 2.9 has two interesting corollaries. Denote by $\mathbf{S}(n)$ the vector
space of real $n \times n$ symmetric matrices and by $\mathbf{SPD}(n)$ the set of $n \times n$ symmetric, positive, definite
matrices. It is known that $\exp\colon \mathbf{S}(n) \to \mathbf{SPD}(n)$ is a bijection.

Corollary 2.10 The exponential map has the following properties:

(1) The map $\exp\colon \mathcal{N}il(r) \to \mathcal{U}ni(r)$ is a diffeomorphism.

(2) The map $\exp\colon \mathbf{S}(n) \to \mathbf{SPD}(n)$ is a diffeomorphism.

By combining Theorem 2.4 and Theorem 2.9 we obtain the following result about the
existence and uniqueness of logarithms of real matrices:

Theorem 2.11 (a) If $A$ is any real invertible $n \times n$ matrix and $A$ has no negative eigenvalues,
then $A$ has a unique real logarithm, $X$, with $X \in \mathcal{S}(n)$.

(b) The image, $\exp(\mathcal{S}(n))$, of $\mathcal{S}(n)$ by the exponential map is the set of real invertible
matrices with no negative eigenvalues and $\exp\colon \mathcal{S}(n) \to \exp(\mathcal{S}(n))$ is a diffeomorphism between
these two spaces.

Proof. (a) If we go back to the proof of Theorem 2.4, we see that complex eigenvalues of
the logarithm, $X$, produced by that proof only occur for matrices

\[
S(\rho_k, \theta_k) = \begin{pmatrix} \log \rho_k & -\theta_k \\ \theta_k & \log \rho_k \end{pmatrix}
\]

associated with eigenvalues $\lambda_k + i\mu_k = \rho_k e^{i\theta_k}$. However, the eigenvalues of such matrices are
$\log \rho_k \pm i\theta_k$ and since $A$ has no negative eigenvalues, we may assume that $-\pi < \theta_k < \pi$, and
so $X \in \mathcal{S}(n)$, as desired. By Theorem 2.9, such a logarithm is unique.

(b) Part (a) proves that the set of real invertible matrices with no negative eigenvalues
is contained in $\exp(\mathcal{S}(n))$. However, for any matrix, $X \in \mathcal{S}(n)$, since every eigenvalue of $e^X$
is of the form $e^{\lambda + i\mu} = e^{\lambda} e^{i\mu}$ for some eigenvalue, $\lambda + i\mu$, of $X$ and since $\lambda + i\mu$ satisfies the
condition $-\pi < \mu < \pi$, the number, $e^{i\mu}$, is never negative, so $e^X$ has no negative eigenvalues.
Then, (b) follows directly from Theorem 2.9. □

Remark: Theorem 2.11 (a) first appeared in Kenney and Laub [22] (Lemma A2, Appendix
A) but without proof.

3 Square Roots of Real Matrices; Criteria for Existence
and Uniqueness

In this section we investigate the problem of finding a square root of a matrix, $A$, that is, a
matrix, $X$, such that $X^2 = A$. If $A$ is an invertible (complex) matrix, then it always has a
square root, but singular matrices may fail to have a square root. For example, the nilpotent
matrix,

\[
H = \begin{pmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \ddots & \vdots \\
0 & 0 & \cdots & 0 & 1 \\
0 & 0 & \cdots & 0 & 0
\end{pmatrix}
\]

has no square root (check this!). The problem of finding square roots of matrices is thoroughly
investigated in Gantmacher [13], Chapter VIII, Sections 6 and 7. For singular matrices,
finding a square root reduces to the problem of finding the square root of a nilpotent matrix,
which is not always possible. A necessary and sufficient condition for the existence of a
square root is given in Horn and Johnson [20], see Chapter 6, Section 4, especially Theorem
6.1.12 and Theorem 6.4.14. This criterion is rather complicated because it deals with
non-singular as well as singular matrices. In these notes, we will restrict our attention to invertible
matrices. The two main theorems of this section are Theorem 3.4 and Theorem 3.8. The
former theorem appears in Higham [16] (Theorem 5). The first step is to prove a version of
Theorem 1.11 for the function $A \mapsto A^2$, where $A$ is invertible.

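Theorem 3.1 For any (real or complex) invertible $n \times n$ matrix, $A$, if $A = PJP^{-1}$ where
$J$ is a Jordan matrix of the form

\[
J = \begin{pmatrix} J_{r_1}(\lambda_1) & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & J_{r_m}(\lambda_m) \end{pmatrix},
\]

then there is some invertible matrix, $Q$, so that the Jordan form of $A^2$ is given by

\[
A^2 = Q\, s(J)\, Q^{-1},
\]

where $s(J)$ is the Jordan matrix

\[
s(J) = \begin{pmatrix} J_{r_1}(\lambda_1^2) & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & J_{r_m}(\lambda_m^2) \end{pmatrix},
\]

that is, each $J_{r_k}(\lambda_k^2)$ is obtained from $J_{r_k}(\lambda_k)$ by replacing all the diagonal entries $\lambda_k$ by $\lambda_k^2$.
Equivalently, if the list of elementary divisors of $A$ is

\[
(X - \lambda_1)^{r_1}, \ldots, (X - \lambda_m)^{r_m},
\]

then the list of elementary divisors of $A^2$ is

\[
(X - \lambda_1^2)^{r_1}, \ldots, (X - \lambda_m^2)^{r_m}.
\]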
Proof. Theorem 3.1 is a consequence of a general theorem about functions of matrices proved
in Gantmacher [13], see Chapter VI, Section 8, Theorem 9. However, it is possible to give a
simpler proof exploiting special properties of the squaring map.

Let $f$ be the linear map defined by the matrix $A$. The proof is modeled after the proof
of Theorem 1.11. Consider the direct sum decomposition given by Theorem 1.5,

\[
V = V_1 \oplus V_2 \oplus \cdots \oplus V_m,
\]

where each $V_i$ is a cyclic $\mathbb{C}[X]$-module such that the minimal polynomial of the restriction
of $f$ to $V_i$ is of the form $(X - \lambda_i)^{r_i}$. We will prove that

(1) The vectors

\[
u,\ f^2(u),\ f^4(u),\ \ldots,\ f^{2(r_i - 1)}(u)
\]

form a basis of $V_i$.

(2) The polynomial $(X - \lambda_i^2)^{r_i}$ is the minimal polynomial of the restriction of $f^2$ to $V_i$.

Since $V_i$ is invariant under $f$, it is clear that $V_i$ is invariant under $f^2 = f \circ f$. Thus, we can
view $V_i$ as a $\mathbb{C}[X]$-module with respect to $f^2$. Let $N = f - \lambda_i \mathrm{id}$. To say that $(X - \lambda_i)^{r_i}$ is the
minimal polynomial of the restriction of $f$ to $V_i$ is equivalent to saying that $N$ is nilpotent
with index of nilpotency, $r = r_i$. Now, $N$ and $\lambda_i \mathrm{id}$ commute so as $f = \lambda_i \mathrm{id} + N$, we have

\[
f^2 = \lambda_i^2 \mathrm{id} + 2\lambda_i N + N^2
\]

and so

\[
f^2 - \lambda_i^2 \mathrm{id} = 2\lambda_i N + N^2.
\]

Since we are assuming that $f$ is invertible, $\lambda_i \neq 0$, so

\[
f^2 - \lambda_i^2 \mathrm{id} = 2\lambda_i \left(N + \frac{N^2}{2\lambda_i}\right).
\]

If we let

\[
\widetilde{N} = N + \frac{N^2}{2\lambda_i},
\]

we claim that

\[
\widetilde{N}^{r-1} = N^{r-1} \quad\text{and}\quad \widetilde{N}^{r} = 0.
\]

The proof is identical to the proof given in Theorem 1.11. Again, as in the proof of
Theorem 1.11, we deduce that we have $\widetilde{N}^{r-1}(u) \neq 0$ and $\widetilde{N}^{r}(u) = 0$, from which we infer
that

\[
(u, \widetilde{N}(u), \ldots, \widetilde{N}^{r-1}(u))
\]

is a basis of $V_i$. But $f^2 - \lambda_i^2 \mathrm{id} = 2\lambda_i \widetilde{N}$, so for $k = 0, \ldots, r - 1$, each $\widetilde{N}^k(u)$ is a linear
combination of the vectors $u, f^2(u), \ldots, f^{2(r-1)}(u)$, which implies that

\[
(u, f^2(u), f^4(u), \ldots, f^{2(r-1)}(u))
\]

is a basis of $V_i$. This implies that any annihilating polynomial of $V_i$ has degree no less than
$r$ and since $(X - \lambda_i^2)^r$ annihilates $V_i$, it is the minimal polynomial of $V_i$. Theorem 3.1 follows
immediately from Proposition 1.6. □
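
The role of the invertibility hypothesis in Theorem 3.1 can be seen on a single Jordan block with integer entries. The sketch below (hypothetical helper names, exact integer arithmetic) computes the nilpotency index of $J_r(\lambda)^2 - \lambda^2 I$: for $\lambda \neq 0$ it equals $r$, consistent with a single block $J_r(\lambda^2)$, whereas for $\lambda = 0$ it drops, so the block structure is not preserved.

```python
def jordan_block(lam, r):
    """J_r(lam): lam on the diagonal, 1 on the superdiagonal."""
    return [[lam if i == j else (1 if j == i + 1 else 0) for j in range(r)]
            for i in range(r)]

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def nilpotency_index(m):
    """Smallest k >= 1 with m^k == 0, or None if m^n != 0 for the n x n matrix m."""
    n = len(m)
    power = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for k in range(1, n + 1):
        power = mat_mul(power, m)
        if all(x == 0 for row in power for x in row):
            return k
    return None

def square_shift_index(lam, r):
    """Nilpotency index of J_r(lam)^2 - lam^2 I."""
    j = jordan_block(lam, r)
    sq = mat_mul(j, j)
    shifted = [[sq[i][k] - (lam * lam if i == k else 0) for k in range(r)]
               for i in range(r)]
    return nilpotency_index(shifted)
```

Here `square_shift_index(3, 4)` is 4, while `square_shift_index(0, 4)` is 2, since $J_4(0)^2$ already squares to zero.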



Remark: Theorem 3.1 can be easily generalized to the map $A \mapsto A^p$, for any $p \geq 2$, that
is, by replacing $A^2$ by $A^p$, provided $A$ is invertible. Thus, if the list of elementary divisors
of $A$ is

\[
(X - \lambda_1)^{r_1}, \ldots, (X - \lambda_m)^{r_m},
\]

then the list of elementary divisors of $A^p$ is

\[
(X - \lambda_1^p)^{r_1}, \ldots, (X - \lambda_m^p)^{r_m}.
\]



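The next step is to find the square root of a Jordan block. Since we are assuming that
our matrix is invertible, every Jordan block, $J_{r_k}(\alpha_k)$, can be written as

\[
J_{r_k}(\alpha_k) = \alpha_k I \left(I + \frac{H}{\alpha_k}\right),
\]

where $H$ is nilpotent. It is easy to find a square root of $\alpha_k I$. If $\alpha_k = \rho_k e^{i\theta_k}$, with $\rho_k > 0$,
then

\[
S_k = \begin{pmatrix}
\sqrt{\rho_k}\, e^{i\theta_k/2} & 0 & \cdots & 0 \\
0 & \sqrt{\rho_k}\, e^{i\theta_k/2} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \sqrt{\rho_k}\, e^{i\theta_k/2}
\end{pmatrix}
\]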
is a square root of $\alpha_k I$. Therefore, the problem reduces to finding square roots of unipotent
matrices. For this, we recall the power series

\[
(1 + x)^{\frac{1}{2}} = 1 + \frac{1}{2}x + \cdots + \frac{1}{n!}\,\frac{1}{2}\left(\frac{1}{2} - 1\right)\cdots\left(\frac{1}{2} - n + 1\right)x^n + \cdots
= \sum_{n=0}^{\infty} (-1)^{n-1}\frac{(2n)!}{(2n-1)(n!)^2 2^{2n}}\, x^n,
\]

which is normally convergent for $|x| < 1$. Then, we can define the power series, $R$, of a
matrix variable, $A$, by

\[
R(A) = \sum_{n=1}^{\infty} (-1)^{n-1}\frac{(2n)!}{(2n-1)(n!)^2 2^{2n}}\, A^n,
\]

and this power series converges normally for $\|A\| < 1$. As a formal power series, note that
$R(0) = 0$ and $R'(0) = \frac{1}{2} \neq 0$ so, by a theorem about formal power series, $R$ has a unique
inverse, $S$, such that $S(0) = 0$ (see Lang [21] or H. Cartan [9]). But, if we consider the power
series, $S(A) = (I + A)^2 - I$, when $A$ is a real number, we have $R(A) = \sqrt{1 + A} - 1$, so we
get

\[
R \circ S(A) = \sqrt{1 + (1 + A)^2 - 1} - 1 = A,
\]

from which we deduce that $S$ and $R$ are mutual inverses. But, $S$ converges everywhere and
$R$ converges for $\|A\| < 1$, so by another theorem about converging power series, if we let
$\sqrt{I + A} = R(A) + I$, there is some $r$, with $0 < r < 1$, so that

\[
(\sqrt{I + A})^2 = I + A, \quad\text{if } \|A\| < r
\]

and

\[
\sqrt{(I + A)^2} = I + A, \quad\text{if } \|A\| < r.
\]

If $A$ is unipotent, that is, $A = I + N$ with $N$ nilpotent, we see that the series has only finitely
many terms. This fact allows us to prove the proposition below.

Proposition 3.2 The squaring map, $A \mapsto A^2$, is a homeomorphism from $\mathcal{U}ni(r)$ to itself
whose inverse is the map $A \mapsto \sqrt{A} = R(A - I) + I$.

Proof. If $A = I + N$ with $N^r = 0$, as $A^2 = I + 2N + N^2$ it is clear that $(2N + N^2)^r = 0$, so
the squaring map is well defined on unipotent matrices. We use the technique of Proposition
2.2. Consider the map

\[
t \mapsto (\sqrt{I + tN})^2 - (I + tN), \quad t \in \mathbb{R}.
\]

It is a polynomial since $N^r = 0$. Furthermore, for $t$ sufficiently small, $\|tN\| < 1$ and we
have $(\sqrt{I + tN})^2 = I + tN$, so the above polynomial vanishes in a neighborhood of $0$, which
implies that it is identically zero. Therefore, $(\sqrt{I + N})^2 = I + N$, as required.

Next, consider the map

\[
t \mapsto \sqrt{(I + tN)^2} - (I + tN), \quad t \in \mathbb{R}.
\]

It is a polynomial since $N^r = 0$. Furthermore, for $t$ sufficiently small, $\|tN\| < 1$ and we have
$\sqrt{(I + tN)^2} = I + tN$, so we conclude as above that the above map is identically zero and
that $\sqrt{(I + N)^2} = I + N$. □
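
Because the series $R$ terminates on nilpotent matrices, the square root of a unipotent matrix can be computed exactly with rational arithmetic. The following sketch is an illustration under stated assumptions: the input is assumed to be $I + N$ with $N$ nilpotent, matrices are square Python lists, the binomial coefficients $\binom{1/2}{k}$ are generated by the usual recurrence, and the name `sqrt_unipotent` is hypothetical.

```python
from fractions import Fraction

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def sqrt_unipotent(u):
    """Return I + sum_{k>=1} binom(1/2, k) N^k, where N = u - I is nilpotent."""
    n = len(u)
    nil = [[Fraction(u[i][j]) - (1 if i == j else 0) for j in range(n)]
           for i in range(n)]
    result = [[Fraction(1 if i == j else 0) for j in range(n)] for i in range(n)]
    power = [row[:] for row in result]
    coeff = Fraction(1)  # binom(1/2, 0)
    for k in range(1, n):  # N^n = 0 for an n x n nilpotent matrix
        coeff = coeff * (Fraction(1, 2) - (k - 1)) / k  # binom(1/2, k)
        power = mat_mul(power, nil)
        result = [[result[i][j] + coeff * power[i][j] for j in range(n)]
                  for i in range(n)]
    return result
```

For `u = [[1, 2, 3], [0, 1, 4], [0, 0, 1]]` the result is `[[1, 1, 1/2], [0, 1, 2], [0, 0, 1]]`, and squaring it with `mat_mul` returns `u` exactly.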

Remark: Proposition 3.2 can be easily generalized to the map $A \mapsto A^p$, for any $p \geq 2$, by
using the power series

\[
(I + A)^{\frac{1}{p}} = I + \frac{1}{p}A + \cdots + \frac{1}{n!}\,\frac{1}{p}\left(\frac{1}{p} - 1\right)\cdots\left(\frac{1}{p} - n + 1\right)A^n + \cdots.
\]

Using Proposition 3.2, we can find a square root for the unipotent part of a Jordan block,

\[
J_{r_k}(\alpha_k) = \alpha_k I \left(I + \frac{H}{\alpha_k}\right).
\]

If $N_k = \frac{H}{\alpha_k}$, then

\[
\sqrt{I + N_k} = I + R(N_k) = I + \sum_{j=1}^{r_k - 1} (-1)^{j-1}\frac{(2j)!}{(2j-1)(j!)^2 2^{2j}}\, N_k^j
\]

is a square root of $I + N_k$. Therefore, we obtained the following theorem:

Theorem 3.3 Every (complex) invertible matrix, $A$, has a square root.

Remark: Theorem 3.3 can be easily generalized to $p$-th roots, for any $p \geq 2$.

We now consider the problem of finding a real square root of an invertible real matrix. 
It turns out that the necessary and sufficient condition is exactly the condition for finding a 
real logarithm of a real matrix. 

Theorem 3.4 Let $A$ be a real invertible $n \times n$ matrix and let $(X - \alpha_1)^{r_1}, \ldots, (X - \alpha_m)^{r_m}$ be
its list of elementary divisors or, equivalently, let $J_{r_1}(\alpha_1), \ldots, J_{r_m}(\alpha_m)$ be its list of Jordan
blocks. Then, $A$ has a real square root iff for every $r_i$ and every real eigenvalue $\alpha_i < 0$, the
number, $m_i$, of Jordan blocks identical to $J_{r_i}(\alpha_i)$ is even.

Proof. The proof is very similar to the proof of Theorem 2.4 so we only point out the
necessary changes. Let $J'$ be a real Jordan matrix so that

\[ A = P J' P^{-1}, \]

where $J'$ satisfies conditions (1) and (2) of Theorem 1.10. As $A$ is invertible, every block
of $J'$ of the form $J_{r_k}(\alpha_k)$ corresponds to a real eigenvalue with $\alpha_k > 0$ and we can write
$J_{r_k}(\alpha_k) = \alpha_k I (I + N_k)$, where $N_k$ is nilpotent. As in Theorem 3.3, we can find a real square
root, $M_k$, of $I + N_k$ and as $\alpha_k > 0$, the diagonal matrix $\alpha_k I$ has the real square root

\[ S_k = \begin{pmatrix} \sqrt{\alpha_k} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \sqrt{\alpha_k} \end{pmatrix}. \]

Set $Y_k = S_k M_k$.

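The other real Jordan blocks of $J'$ are of the form $J_{2r_k}(\lambda_k, \mu_k)$, with $\lambda_k, \mu_k \in \mathbb{R}$, not both
zero. Consequently, we can write

\[ J_{2r_k}(\lambda_k, \mu_k) = D_k(I + N_k), \]

where

\[ D_k = \begin{pmatrix} L(\lambda_k, \mu_k) & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & L(\lambda_k, \mu_k) \end{pmatrix} \]

with

\[ L(\lambda_k, \mu_k) = \begin{pmatrix} \lambda_k & -\mu_k \\ \mu_k & \lambda_k \end{pmatrix}, \]

and $N_k = D_k^{-1}H$ is nilpotent. We can find a square root, $M_k$, of $I + N_k$ as in Theorem 3.3.
If we write $\lambda_k + i\mu_k = \rho_k e^{i\theta_k}$, then

\[ L(\lambda_k, \mu_k) = \rho_k \begin{pmatrix} \cos\theta_k & -\sin\theta_k \\ \sin\theta_k & \cos\theta_k \end{pmatrix}. \]

Then, if we set

\[ S(\rho_k, \theta_k) = \sqrt{\rho_k} \begin{pmatrix} \cos(\theta_k/2) & -\sin(\theta_k/2) \\ \sin(\theta_k/2) & \cos(\theta_k/2) \end{pmatrix}, \]

a real matrix, we have

\[ L(\lambda_k, \mu_k) = S(\rho_k, \theta_k)^2. \]

If we form the real block diagonal matrix,

\[ S_k = \begin{pmatrix} S(\rho_k, \theta_k) & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & S(\rho_k, \theta_k) \end{pmatrix}, \]

we have $D_k = S_k^2$ and then the matrix $Y_k = S_k M_k$ is a square root of $J_{2r_k}(\lambda_k, \mu_k)$. Finally,
if $Y$ is the block diagonal matrix $\mathrm{diag}(Y_1, \ldots, Y_m)$, then $X = P Y P^{-1}$ is a square root of $A$.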
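
The half-angle construction for the $2 \times 2$ blocks is easy to check numerically. The sketch below assumes NumPy; the function names `L` and `S` mirror the notation of the proof, and the sample values of $\lambda$ and $\mu$ are arbitrary.

```python
import math
import numpy as np

def L(lam, mu):
    """Real 2 x 2 block representing the complex number lam + i*mu."""
    return np.array([[lam, -mu], [mu, lam]])

def S(rho, theta):
    """sqrt(rho) times the rotation by theta / 2."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return math.sqrt(rho) * np.array([[c, -s], [s, c]])

lam, mu = -3.0, 4.0                 # illustrative values, lam + i*mu = 5 e^{i theta}
rho = math.hypot(lam, mu)           # modulus
theta = math.atan2(mu, lam)         # argument in (-pi, pi]
root = S(rho, theta)
```

With these values `root @ root` agrees with `L(lam, mu)` up to rounding, as the identity $L(\lambda_k, \mu_k) = S(\rho_k, \theta_k)^2$ predicts.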

Let us now prove that if $A$ has a real square root, $X$, then $A$ satisfies the condition of
Theorem 3.4. Since $X$ is a real matrix, we know from the proof of Theorem 1.9 that the
Jordan blocks of $X$ associated with complex eigenvalues occur in conjugate pairs, so they
are of the form

\[ J_{r_k}(\alpha_k) \quad\text{and}\quad J_{r_k}(\overline{\alpha}_k), \qquad \alpha_k = \lambda_k + i\mu_k, \ \mu_k \neq 0. \]

By Theorem 3.1, the Jordan blocks of $A = X^2$ are obtained by replacing each $\alpha_k$ by $\alpha_k^2$, that
is, they are of the form

\[ J_{r_k}(\alpha_k^2) \quad\text{and}\quad J_{r_k}(\overline{\alpha}_k^2), \qquad \alpha_k = \lambda_k + i\mu_k, \ \mu_k \neq 0. \]

If $\alpha_k \in \mathbb{R}$, then $\alpha_k^2 > 0$, so the negative eigenvalues of $A$ must be of the form $\alpha_k^2$ or $\overline{\alpha}_k^2$, with
$\alpha_k$ complex. This implies that $\alpha_k = \sqrt{\rho_k}\, e^{i\pi/2}$, but then $\overline{\alpha}_k = \sqrt{\rho_k}\, e^{-i\pi/2}$ and so

\[ \alpha_k^2 = \overline{\alpha}_k^2. \]

Consequently, negative eigenvalues of $A$ are associated with Jordan blocks that occur in pairs,
as claimed. $\Box$
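
As a small illustration of the parity condition (a sketch assuming NumPy, with illustrative names): the matrix $-cI_2$ with $c > 0$ has the negative eigenvalue $-c$ with two identical $1 \times 1$ Jordan blocks, and $\sqrt{c}$ times the rotation by $\pi/2$ is a real square root.

```python
import numpy as np

c = 9.0                                   # illustrative positive constant
A = -c * np.eye(2)                        # eigenvalue -c, two Jordan blocks J_1(-c)
J = np.array([[0.0, -1.0], [1.0, 0.0]])   # rotation by pi/2, so J @ J = -I
X = np.sqrt(c) * J                        # real candidate square root of A
residual = np.linalg.norm(X @ X - A)      # should vanish up to rounding
```

By contrast, a single block such as the $1 \times 1$ matrix $(-c)$ has no real square root, since $x^2 \geq 0$ for every real $x$.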

Remark: Theorem 3.4 can be easily generalized to $p$th roots, for any $p \geq 2$.

Theorem 3.4 appears in Higham [16] as Theorem 5 but no explicit proof is given. Instead,
Higham states: "The proof is a straightforward modification of Theorem 1 in Culver [11]
and is omitted." Culver's proof uses results from Gantmacher [14] and does not provide a
constructive method for obtaining a square root. We gave a more constructive proof (but
perhaps longer).

Corollary 3.5 For every real invertible matrix, $A$, if $A$ has no negative eigenvalues, then
$A$ has a real square root.

We will now provide a sufficient condition for the uniqueness of a real square root. For
this, we consider the open set, $\mathcal{H}(n)$, consisting of all real $n \times n$ matrices whose eigenvalues,
$\alpha = \lambda + i\mu$, have a positive real part, $\lambda > 0$. We express this condition as $\Re(\alpha) > 0$.
Obviously, such matrices are invertible and can't have negative eigenvalues. We need a
version of Proposition 2.8 for semisimple matrices in $\mathcal{H}(n)$.

Remark: To deal with $p$th roots, we consider matrices whose eigenvalues, $\rho e^{i\theta}$, satisfy the
condition $-\frac{\pi}{p} < \theta < \frac{\pi}{p}$.

Proposition 3.6 For any two real or complex matrices, $S_1$ and $S_2$, if the eigenvalues, $\rho e^{i\theta}$,
of $S_1$ and $S_2$ satisfy the condition $-\frac{\pi}{2} < \theta \leq \frac{\pi}{2}$, if $S_1$ and $S_2$ are semisimple and if $S_1^2 = S_2^2$,
then $S_1 = S_2$.

Proof. The proof is very similar to that of Proposition 2.8 so we only indicate where
modifications are needed. We use the fact that if $u$ is an eigenvector of a linear map, $A$, associated
with some eigenvalue, $\lambda$, then $u$ is an eigenvector of $A^2$ associated with the eigenvalue $\lambda^2$.
We replace every occurrence of $e^{\lambda_i}$ by $\lambda_i^2$ (and $e^{\mu}$ by $\mu^2$). As in the proof of Proposition 2.8,
we obtain the equation

\[ \alpha_1 \mu^2 u_1 + \cdots + \alpha_n \mu^2 u_n = \alpha_1 \lambda_1^2 u_1 + \cdots + \alpha_n \lambda_n^2 u_n. \]

Therefore, we deduce that

\[ \alpha_k \mu^2 = \alpha_k \lambda_k^2, \quad 1 \leq k \leq n. \]

Consequently, as $\mu, \lambda_k \neq 0$, if $\alpha_k \neq 0$, then

\[ \mu^2 = \lambda_k^2, \]

which implies $\mu = \pm\lambda_k$. However, the hypothesis on the eigenvalues of $S_1$ and $S_2$ implies
that $\mu = \lambda_k$. The end of the proof is identical to that of Proposition 2.8. $\Box$

Obviously, Proposition 3.6 holds for real semisimple matrices, $S_1, S_2$, in $\mathcal{H}(n)$.

Remark: Proposition 3.6 also holds for the map $S \mapsto S^p$, for any $p \geq 2$, under the condition
$-\frac{\pi}{p} < \theta \leq \frac{\pi}{p}$.

We have the following analog of Theorem 2.9, but we content ourselves with a weaker
result:

Theorem 3.7 The restriction of the squaring map, $A \mapsto A^2$, to $\mathcal{H}(n)$ is injective.

Proof. Let $X_1, X_2 \in \mathcal{H}(n)$ and assume that $X_1^2 = X_2^2$. As $X_1$ and $X_2$ are invertible, by
Proposition 2.6, we can write $X_1 = S_1(I + N_1)$ and $X_2 = S_2(I + N_2)$, where $S_1, S_2$ are
semisimple, $N_1, N_2$ are nilpotent, $S_1(I + N_1) = (I + N_1)S_1$ and $S_2(I + N_2) = (I + N_2)S_2$. As
$X_1^2 = X_2^2$, we get

\[ S_1^2(I + N_1)^2 = S_2^2(I + N_2)^2. \]

Now, as $S_1$ and $S_2$ are semisimple and invertible, $S_1^2$ and $S_2^2$ are semisimple and invertible,
and as $N_1$ and $N_2$ are nilpotent, $2N_1 + N_1^2$ and $2N_2 + N_2^2$ are nilpotent, so $(I + N_1)^2$ and
$(I + N_2)^2$ are unipotent. Moreover, $S_1(I + N_1) = (I + N_1)S_1$ and $S_2(I + N_2) = (I + N_2)S_2$
imply that $S_1^2(I + N_1)^2 = (I + N_1)^2 S_1^2$ and $S_2^2(I + N_2)^2 = (I + N_2)^2 S_2^2$. Therefore, by the
uniqueness statement of Proposition 2.6, we get

\[ S_1^2 = S_2^2 \quad\text{and}\quad (I + N_1)^2 = (I + N_2)^2. \]

However, as $X_1, X_2 \in \mathcal{H}(n)$ we have $S_1, S_2 \in \mathcal{H}(n)$ and Proposition 3.6 implies that $S_1 = S_2$.
Since $I + N_1$ and $I + N_2$ are unipotent, Proposition 3.2 implies that $N_1 = N_2$. Therefore,
$X_1 = X_2$, as required. $\Box$
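
The role of the restriction to $\mathcal{H}(n)$ can be seen on the identity matrix, which has many real square roots but only one with all eigenvalues in the open right half-plane. The following sketch assumes NumPy; the helper `in_H` and the list of candidate roots are illustrative.

```python
import numpy as np

def in_H(X, tol=1e-12):
    """True if every eigenvalue of X has positive real part."""
    return bool(np.all(np.linalg.eigvals(X).real > tol))

A = np.eye(2)
candidates = [
    np.eye(2),                              # eigenvalues 1, 1
    -np.eye(2),                             # eigenvalues -1, -1
    np.diag([1.0, -1.0]),                   # eigenvalues 1, -1
    np.array([[0.0, 1.0], [1.0, 0.0]]),     # eigenvalues 1, -1
]
squares_to_A = [bool(np.allclose(R @ R, A)) for R in candidates]
membership = [in_H(R) for R in candidates]
```

All four candidates square to the identity, yet only the first one lies in $\mathcal{H}(2)$, which is consistent with the injectivity statement.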

Remark: Theorem 3.7 also holds for the restriction of the squaring map to real or complex
matrices, $X$, whose eigenvalues, $\rho e^{i\theta}$, satisfy the condition $-\frac{\pi}{2} < \theta \leq \frac{\pi}{2}$. This result is proved
in DePrima and Johnson [12] by a different method. However, DePrima and Johnson need
an extra condition, see the discussion at the end of this section.

We can now prove the analog of Theorem 2.11 for square roots.

Theorem 3.8 If $A$ is any real invertible $n \times n$ matrix and $A$ has no negative eigenvalues,
then $A$ has a unique real square root, $X$, with $X \in \mathcal{H}(n)$.



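Proof. If we go back to the proof of Theorem 3.4, we see that complex eigenvalues of the
square root, $X$, produced by that proof only occur for matrices

\[ S(\rho_k, \theta_k) = \sqrt{\rho_k} \begin{pmatrix} \cos(\theta_k/2) & -\sin(\theta_k/2) \\ \sin(\theta_k/2) & \cos(\theta_k/2) \end{pmatrix}, \]

associated with eigenvalues $\lambda_k + i\mu_k = \rho_k e^{i\theta_k}$. However, the eigenvalues of such matrices are
$\sqrt{\rho_k}\, e^{\pm i\theta_k/2}$ and since $A$ has no negative eigenvalues, we may assume that $-\pi < \theta_k < \pi$, and
so $-\frac{\pi}{2} < \frac{\theta_k}{2} < \frac{\pi}{2}$, which means that $X \in \mathcal{H}(n)$, as desired. By Theorem 3.7, such a square
root is unique. $\Box$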
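
For a diagonalizable matrix, the square root of Theorem 3.8 can be approximated through an eigendecomposition instead of the real Jordan form used in the proof. This is a swapped-in technique, shown only as a sketch: it assumes NumPy, assumes that $A$ is diagonalizable with no eigenvalue on the closed negative real axis, and uses the hypothetical name `principal_sqrt_diagonalizable`.

```python
import numpy as np

def principal_sqrt_diagonalizable(A):
    """V diag(sqrt(w)) V^{-1}, using the principal branch of the complex square root."""
    w, V = np.linalg.eig(A)
    # np.sqrt on complex input returns the root with nonnegative real part.
    return V @ np.diag(np.sqrt(w.astype(complex))) @ np.linalg.inv(V)

A = np.array([[1.0, -2.0],
              [2.0, 1.0]])          # eigenvalues 1 + 2i and 1 - 2i
X = principal_sqrt_diagonalizable(A)
```

Although the computation runs in complex arithmetic, the imaginary part of `X` vanishes up to rounding because the eigenvalues come in conjugate pairs, and the eigenvalues of `X` have positive real part, so `X.real` is the square root in $\mathcal{H}(2)$.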

Theorem 3.8 is stated in a number of papers including Bini, Higham and Meini [6], Cheng,
Higham, Kenney and Laub [10] and Kenney and Laub [22]. Theorem 3.8 also appears in
Higham [17] as Theorem 1.29. Its proof relies on Theorem 1.26 and Theorem 1.18 (both in
Higham's book), whose proof is not given in full (closer examination reveals that Theorem
1.36 (in Higham's book) is needed to prove Theorem 1.26). Although Higham's Theorem
1.26 implies our Theorem 3.7, we feel that the proof of Theorem 3.7 is of independent interest
and is more direct.

As we already said in Section 2, Kenney and Laub [22] state Theorem 3.8 as Lemma A1
in Appendix A. The proof is sketched briefly. Existence follows from the Cauchy integral
formula for operators, a method used by DePrima and Johnson [12] in which a similar result
is proved for complex matrices (Section 4, Lemma 1). Uniqueness is proved in DePrima and
Johnson [12] but it uses an extra condition. The hypotheses of Lemma 1 in DePrima and
Johnson are that $A$ and $X$ are complex invertible matrices and that $X$ satisfies the conditions

(i) $X^2 = A$,

(ii) the eigenvalues, $\rho e^{i\theta}$, of $X$ satisfy $-\frac{\pi}{2} < \theta \leq \frac{\pi}{2}$,

(iii) For any matrix, $S$, if $AS = SA$, then $XS = SX$.

Observe that condition (ii) allows $\theta = \frac{\pi}{2}$, which yields matrices, $A = X^2$, with negative
eigenvalues. In this case, $A$ may not have any real square root but DePrima and Johnson are
only concerned with complex matrices and a complex square root always exists. To guarantee
the existence of real logarithms, Kenney and Laub tighten condition (ii) to $-\frac{\pi}{2} < \theta < \frac{\pi}{2}$.
They also assert that condition (iii) follows from conditions (i) and (ii). This can be shown as
follows: First, recall that we have shown that uniqueness follows from (i) and (ii). Uniqueness
under conditions (i) and (ii) can also be shown to be a consequence of Theorem 2 in Higham
[16]. Now, assume $X^2 = A$ and $SA = AS$. We may assume that $S$ is invertible since the set
of invertible matrices is dense in the set of all matrices. Then, as $SA = AS$, we have

\[ (SXS^{-1})^2 = SX^2S^{-1} = SAS^{-1} = A. \]

Thus, $SXS^{-1}$ is a square root of $A$. Furthermore, $X$ and $SXS^{-1}$ have the same eigenvalues
so $SXS^{-1}$ satisfies (i) and (ii) and, by uniqueness, $X = SXS^{-1}$, that is, $XS = SX$.
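
A small numerical example shows why the eigenvalue restriction (ii) matters for the commutation property (iii). The sketch assumes NumPy, and the matrices are chosen purely for illustration: $A = 4I$ commutes with every $S$, the root $2I$ satisfies (ii) and commutes with $S$, while the root $\mathrm{diag}(2, -2)$ violates (ii) and fails to commute with the swap matrix.

```python
import numpy as np

A = 4.0 * np.eye(2)
S = np.array([[0.0, 1.0], [1.0, 0.0]])      # swap matrix, commutes with A
X_good = 2.0 * np.eye(2)                    # eigenvalues 2, 2: satisfies (ii)
X_bad = np.diag([2.0, -2.0])                # eigenvalue -2 has theta = pi

A_commutes = bool(np.allclose(A @ S, S @ A))
good_commutes = bool(np.allclose(X_good @ S, S @ X_good))
bad_commutes = bool(np.allclose(X_bad @ S, S @ X_bad))
```

Both `X_good` and `X_bad` square to `A`, but only `X_good` inherits the commutation with `S`.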
Since Kenney and Laub only provide a sketch of Lemma A1 and since Higham [17] does
not give all the details of the proof either, we felt that the reader would appreciate seeing a
complete proof of Theorem 3.8.

4 Conclusion 

It is interesting that Theorem 2.11 and Theorem 3.8 are the basis for numerical methods for
computing the exponential or the logarithm of a matrix. The key point is that the following
identities hold:

\[ e^A = (e^{A/2^k})^{2^k} \quad\text{and}\quad \log(A) = 2^k \log(A^{1/2^k}), \]

where in the second case, $A^{1/2^k}$ is the unique $k$th iterated square root of $A$ whose eigenvalues, $\rho e^{i\theta}$,
lie in the sector $-\frac{\pi}{2^k} < \theta < \frac{\pi}{2^k}$. The first identity is trivial and the second one can be shown
by induction from the identity

\[ \log(A) = 2\log(A^{1/2}), \]

where $A^{1/2}$ is the unique square root of $A$ whose eigenvalues, $\rho e^{i\theta}$, lie in the sector
$-\frac{\pi}{2} < \theta < \frac{\pi}{2}$. Let $X = A^{1/2}$. Then, it
is easy to see that the eigenvalues, $\alpha$, of $\log(X)$ satisfy the condition $-\frac{\pi}{2} < \Im(\alpha) < \frac{\pi}{2}$. Then,
$Y = 2\log(X) = 2\log(A^{1/2})$ satisfies

\[ e^Y = e^{\log(A^{1/2}) + \log(A^{1/2})} = e^{\log(A^{1/2})} e^{\log(A^{1/2})} = A^{1/2}A^{1/2} = A, \]

and the eigenvalues, $\alpha$, of $Y$ satisfy the condition $-\pi < \Im(\alpha) < \pi$ so, by the uniqueness
part of Theorem 2.11, we must have $\log(A) = 2\log(A^{1/2})$.
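
As a worked instance of this identity (an illustration added here, with $R(\theta)$ and $J$ as ad hoc notation), take $A$ to be a plane rotation by an angle $\theta$ with $-\pi < \theta < \pi$:

```latex
\[
A = R(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix},
\qquad
J = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix},
\qquad
e^{\theta J} = R(\theta).
\]
% The eigenvalues of \theta J are \pm i\theta, with imaginary parts in (-\pi, \pi),
% so \log(A) = \theta J.
% The square root with eigenvalues in the sector (-\pi/2, \pi/2) is A^{1/2} = R(\theta/2),
% whose eigenvalues are e^{\pm i\theta/2}.
\[
\log(A^{1/2}) = \tfrac{\theta}{2} J,
\qquad
2\log(A^{1/2}) = \theta J = \log(A).
\]
```

The eigenvalues of $\frac{\theta}{2}J$ are $\pm i\frac{\theta}{2}$, whose imaginary parts lie in $(-\frac{\pi}{2}, \frac{\pi}{2})$, in agreement with the argument above.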

The identity $\log(A) = 2^k \log(A^{1/2^k})$ leads to a numerical method for computing the
logarithm of a (real) matrix first introduced by Kenney and Laub known as the inverse scaling
and squaring algorithm, see Kenney and Laub [22] and Cheng, Higham, Kenney and Laub
[10]. The idea is that if $A$ is close to the identity, then $\log(A)$ can be computed accurately
using either a truncated power series expansion of $\log(A)$ or better, rational approximations
known as Padé approximants. In order to bring $A$ close to the identity, iterate the operation
of taking the square root of $A$ to obtain $A^{1/2^k}$. Then, after having computed $\log(A^{1/2^k})$,
scale $\log(A^{1/2^k})$ by the factor $2^k$. For details of this method, see Kenney and Laub [22] and
Cheng, Higham, Kenney and Laub [10]. The inverse scaling and squaring method plays an
important role in the log-Euclidean framework introduced by Arsigny, Fillard, Pennec and
Ayache, see Arsigny [1], Arsigny, Fillard, Pennec and Ayache [2, 3] and Arsigny, Pennec and
Ayache [4].
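
The steps just described can be sketched in a few lines. This is not the algorithm of Kenney and Laub or of Cheng, Higham, Kenney and Laub: it is a simplified illustration that assumes NumPy, assumes a diagonalizable input with no eigenvalue on the closed negative real axis, takes square roots through an eigendecomposition, and uses a truncated power series rather than Padé approximants. All function names and the parameters `k` and `terms` are hypothetical.

```python
import numpy as np

def principal_sqrt(A):
    """Principal square root of a diagonalizable matrix via eigendecomposition."""
    w, V = np.linalg.eig(A)
    return V @ np.diag(np.sqrt(w.astype(complex))) @ np.linalg.inv(V)

def log_near_identity(A, terms=20):
    """Truncated series log(I + E) = E - E^2/2 + E^3/3 - ..., valid for small E."""
    n = A.shape[0]
    E = A - np.eye(n)
    result = np.zeros_like(E)
    power = np.eye(n, dtype=E.dtype)        # holds E^j
    for j in range(1, terms + 1):
        power = power @ E
        result = result + ((-1) ** (j + 1) / j) * power
    return result

def log_inverse_scaling_squaring(A, k=8, terms=20):
    """Take k square roots, apply the series, then scale by 2^k."""
    X = A.astype(complex)
    for _ in range(k):
        X = principal_sqrt(X)               # X approaches the identity
    return (2 ** k) * log_near_identity(X, terms)

t = 1.0                                      # rotation angle, |t| < pi
A = np.array([[np.cos(t), -np.sin(t)],
              [np.sin(t), np.cos(t)]])
logA = log_inverse_scaling_squaring(A)
```

For this rotation the computed `logA` is close to the matrix with rows $(0, -t)$ and $(t, 0)$. Production implementations choose `k` and the approximant degree adaptively, which this sketch does not attempt.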

References 

[1] Vincent Arsigny. Processing Data in Lie Groups: An Algebraic Approach. Application to
Non-Linear Registration and Diffusion Tensor MRI. PhD thesis, École Polytechnique,
Palaiseau, France, 2006. Thèse de Sciences.



[2] Vincent Arsigny, Pierre Fillard, Xavier Pennec, and Nicholas Ayache. Log-Euclidean
metrics for fast and simple calculus on diffusion tensors. Magnetic Resonance in
Medicine, 56(2):411-421, 2006.

[3] Vincent Arsigny, Pierre Fillard, Xavier Pennec, and Nicholas Ayache. Geometric means
in a novel vector space structure on symmetric positive-definite matrices. SIAM J. on
Matrix Analysis and Applications, 29(1):328-347, 2007.

[4] Vincent Arsigny, Xavier Pennec, and Nicholas Ayache. Polyrigid and polyaffine
transformations: a novel geometrical tool to deal with non-rigid deformations-application to
the registration of histological slices. Medical Image Analysis, 9(6):507-523, 2005.

[5] Michael Artin. Algebra. Prentice Hall, first edition, 1991. 

[6] Dario A. Bini, Nicholas J. Higham, and Beatrice Meini. Algorithms for the matrix pth 
root. Numerical Algorithms, 39:349-378, 2005. 

[7] Nicolas Bourbaki. Algèbre, Chapitres 4-7. Éléments de Mathématiques. Masson, 1981.

[8] Nicolas Bourbaki. Elements of Mathematics. Lie Groups and Lie Algebras, Chapters 
1-3. Springer, first edition, 1989. 

[9] Henri Cartan. Théorie élémentaire des fonctions analytiques d'une ou plusieurs variables
complexes. Hermann, 1961.

[10] Sheung H. Cheng, Nicholas J. Higham, Charles Kenney, and Alan J. Laub.
Approximating the logarithm of a matrix to specified accuracy. SIAM Journal on Matrix Analysis
and Applications, 22:1112-1125, 2001.

[11] Walter J. Culver. On the existence and uniqueness of the real logarithm of a matrix.
Proc. Amer. Math. Soc., 17:1146-1151, 1966.

[12] C. R. DePrima and C. R. Johnson. The range of $A^{-1}A^*$ in GL(n, C). Linear Algebra
and Its Applications, 9:209-222, 1974.

[13] David S. Dummit and Richard M. Foote. Abstract Algebra. Wiley, second edition, 1999. 

[14] F. R. Gantmacher. The Theory of Matrices, Vol. I. AMS Chelsea, first edition, 1977.

[15] Roger Godement. Cours d'Algèbre. Hermann, first edition, 1963.

[16] Nicholas J. Higham. Computing real square roots of a real matrix. Linear Algebra and 
its Applications, 88/89:405-430, 1987. 

[17] Nicholas J. Higham. Functions of Matrices. Theory and Computation. SIAM, first 
edition, 2008. 



[18] Morris W. Hirsch and Stephen Smale. Differential Equations, Dynamical Systems and
Linear Algebra. Academic Press, first edition, 1974.

[19] Roger A. Horn and Charles R. Johnson. Matrix Analysis. Cambridge University Press, 
first edition, 1990. 

[20] Roger A. Horn and Charles R. Johnson. Topics in Matrix Analysis. Cambridge
University Press, first edition, 1994.

[21] Kenneth Hoffman and Ray Kunze. Linear Algebra. Prentice Hall, second edition, 1971.

[22] Charles S. Kenney and Alan J. Laub. Condition estimates for matrix functions. SIAM
Journal on Matrix Analysis and Applications, 10:191-209, 1989.

[23] Serge Lang. Algebra. Addison Wesley, third edition, 1993. 

[24] Serge Lang. Complex Analysis. GTM No. 103. Springer Verlag, fourth edition, 1999. 

[25] R. Mneimné and F. Testard. Introduction à la Théorie des Groupes de Lie Classiques.
Hermann, first edition, 1997.

[26] Denis Serre. Matrices, Theory and Applications. GTM No. 216. Springer Verlag, first 
edition, 2002. 

[27] Gilbert Strang. Linear Algebra and its Applications. Saunders HBJ, third edition, 1988. 


